How To Tell NFV Software from NFV Vaporware

We’re getting a lot of NFV commentary out of the World Congress event this week, and some of it represents NFV positioning.  Most network and IT vendors have defined at least a proto-plan for NFV at this point, but a few are just starting to articulate their positions.  One is Juniper, whose reputation as a technical leader in networking has been tarnished over the last few years by insipid positioning and lack of management direction.  Juniper’s story is interesting because it illustrates one of the key problems with NFV today—we can’t easily assess what people are saying.

Juniper, in an interview published HERE, is promoting an open NFV architecture, meaning multi-vendor and exploiting open-source software.  OK, I’m with them so far.  They define three layers to NFV, the data center layer, the controller layer, and the services layer.  That sort of corresponds with the three areas of NFV I’ve been touting from the first—NFV Infrastructure, management/orchestration, and VNFs—so I can’t fault that structure either.  The problem with the Juniper position comes when you define the layers in detail to map them to the architecture.

NFVI is more than just a collection of hardware, or every data center and network would be NFVI and we’d be a long way toward deploying NFV already.  The key requirement for NFVI is that whatever resources you offer as your contribution to NFVI have to be represented by a Virtual Infrastructure Manager (VIM).  A VIM takes a resource requirement from the management/orchestration of a service order and translates it into the set of commands/APIs that will actually commit the resource and establish the desired behavior.  Thus, any time a vendor claims to support NFV and touts a data center or infrastructure layer, they should offer a specific NFVI, which means offering the VIM that represents it.
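
To make the VIM’s role concrete, here’s a minimal Python sketch of that translation step; the names (ResourceRequest, FakeCloudDriver, and so on) are hypothetical, not ETSI or vendor APIs.  An abstract resource request comes in from orchestration, and infrastructure-specific calls come out.

    # Hypothetical sketch of the VIM role: translate an abstract resource
    # request from management/orchestration into infrastructure-specific calls.
    from dataclasses import dataclass

    @dataclass
    class ResourceRequest:
        """Abstract ask handed down by management/orchestration."""
        vnf_image: str
        cpu: int
        memory_gb: int
        networks: list

    class FakeCloudDriver:
        """Stands in for whatever actually commits resources (a cloud API, say)."""
        def boot(self, image, cpu, memory_gb):
            print(f"booting {image} with {cpu} vCPU / {memory_gb} GB")
            return "vm-001"

        def attach(self, instance_id, network):
            print(f"attaching {instance_id} to network {network}")

    class VirtualInfrastructureManager:
        """Represents a pool of NFVI to the orchestration layer above it."""
        def __init__(self, driver):
            self.driver = driver

        def allocate(self, request):
            instance_id = self.driver.boot(request.vnf_image, request.cpu,
                                           request.memory_gb)
            for net in request.networks:
                self.driver.attach(instance_id, net)
            return instance_id

    if __name__ == "__main__":
        vim = VirtualInfrastructureManager(FakeCloudDriver())
        vim.allocate(ResourceRequest("firewall-vnf", cpu=2, memory_gb=4,
                                     networks=["mgmt", "wan"]))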

Does Juniper?  Well, this illustrates the next level of complexity.  Remember that the middle level of NFV is the management/orchestration (MANO) function.  This is where a service, specified in some abstract form, is decomposed into resource requests which are passed through the VIMs.  The Orchestrator function is explicit in the ETSI NFV ISG’s end-to-end model, so it has to be explicit in any vendor NFV architecture as well.  Juniper’s control layer, which sits where MANO sits in the real model, is based on their Contrail SDN controller.

SDN controllers are not Orchestrators, no matter whose you may be talking about.  In fact, SDN controllers could probably be placed in the NFVI zone, given that they are one way of commanding network service creation.  So you need a VIM to drive an SDN controller, and you still need an Orchestrator to drive a VIM.

Orchestrators are a pet peeve of mine, or should I say the lack of Orchestrators is.  If you look closely at NFV you see that there are really two levels of orchestration—service orchestration and resource orchestration.  The latter is used to commit resources to a specific service feature, and the former to meld service features into a cohesive service.  OpenStack is a resource orchestration approach, and you can tell because you can’t define every possible service in OpenStack; you can only define the pieces of a service that are cloud-hosted.  Even there, the ISG specs call for things like horizontal scaling and optimization of hosting selection based on service criteria that OpenStack doesn’t offer.
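
To make the two levels concrete, here’s a minimal Python sketch (the names, like SERVICE_CATALOG and resource_orchestrate, are mine, not ETSI or OpenStack constructs): service orchestration decomposes an abstract service into features, and resource orchestration commits resources for each feature.

    # Illustrative two-level orchestration: the service level decomposes a
    # service into features; the resource level commits resources per feature.
    SERVICE_CATALOG = {
        "business-vpn": ["vpn-gateway", "firewall", "monitoring-probe"],
    }

    def resource_orchestrate(feature):
        """Stand-in for the OpenStack-like layer: host one feature somewhere."""
        print(f"  committing resources for {feature}")
        return {"feature": feature, "host": "pool-a"}

    def service_orchestrate(service_name):
        """Decompose a service into features, then drive resource orchestration."""
        features = SERVICE_CATALOG[service_name]
        deployed = [resource_orchestrate(f) for f in features]
        print(f"service {service_name} assembled from {len(deployed)} features")
        return deployed

    if __name__ == "__main__":
        service_orchestrate("business-vpn")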

There are some vendors who offer their own Orchestrators.  HP and IBM on the IT side and Alcatel-Lucent on the network vendor side have presented enough to demonstrate they actually have orchestration capability.  I’ve contended that you can combine open-source functionality (OpenTOSCA, Linked USDL, and implementations of the IETF’s i2aex) to produce most of orchestration (80% in my estimate based on my ExperiaSphere project work), so I’d have no objection to somebody calling out an open and integrated strategy based on this or some other credible combination of components.  Juniper doesn’t do that, however, and that’s true of most vendors who claim “orchestration”.

Then we get to the VNFs, and here we have issues that go beyond vendor representations.  One of the biggest holes in the current NFV approach is that it never took a top-down look at what it expected VNFs to be, or where they’d come from.  As I pointed out a couple of blogs back, VNFs should be considered as running inside a platform-as-a-service framework that was designed to present them with the management and connection features the software was written for.  There is no way to make NFV open and generalized if you can’t provide a framework in which VNF code converted or migrated from some other source can be made to run.  What exactly does it take to run a VNF?  What do we propose to offer to allow VNFs to be run?  If those questions could be answered, we could then say that code meeting a given set of criteria could be considered VNF code.  We can’t say that at the standards level at this point, nor do vendors like Juniper indicate what their framework for a VNF is.  Thus, nothing open is really possible.
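
To illustrate what such a framework might look like, here’s a minimal Python sketch of a hypothetical VNF platform contract (the class and method names are mine, not anything the ISG or a vendor has defined): the hosting framework hands every VNF the connection and management services its code was written to expect.

    # Hypothetical sketch of a VNF hosting contract: the platform services any
    # VNF would be handed, regardless of where its code originally came from.
    class VnfPlatformServices:
        """What the hosting framework promises to any VNF it runs."""
        def get_data_ports(self):
            # connection services the VNF's traffic path expects
            return ["port-in", "port-out"]

        def get_management_channel(self):
            # a mediated management hook, not raw access to shared resources
            return {"type": "rest", "endpoint": "local-proxy"}

        def report_health(self, status):
            print(f"VNF reports health: {status}")

    class MigratedFirewall:
        """A network feature ported from some other source."""
        def run(self, platform):
            ports = platform.get_data_ports()
            mgmt = platform.get_management_channel()
            platform.report_health("running")
            print(f"firewall passing traffic between {ports}, managed via {mgmt['type']}")

    if __name__ == "__main__":
        MigratedFirewall().run(VnfPlatformServices())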

What’s frustrating to me about all of this is that here we are getting the third annual white paper on the progress of NFV and we’re still dancing around what makes up an NFV product.  I can’t believe that Juniper or the other vendors who are issuing NFV PR don’t know what I’ve just outlined.  I can’t believe that people covering the vendors’ announcements don’t know that you have to prove you’re doing something, not just say you are.  Yet here we are, with relatively few actual NFV implementations available and any number of purported ones getting media attention.

We think NFV is a revolution.  Likewise SDN.  They’re not going to be if we can’t distinguish between a real implementation of NFV (which right now could be obtained from Alcatel-Lucent, HP, and perhaps Overture) and something that’s still more vaporware than software.  Juniper, if you have something, do a PowerPoint that maps what you offer to the ETSI document and defend your mapping.  Same for the rest of the vendors.  I’m waiting for your call.

Are NFV and Cloud Computing Missing the Docker Boat?

Often in our industry, a new technology gets linked with an implementation or approach and the link is so tight it constrains further evolution, even sometimes reducing utility.  This may have been the case with cloud computing and NFV, which have been bound from the first to the notion of harnessing units of compute power through virtual machines.  The truth is that other “virtualization” options have existed for ages, and some may be better suited for cloud and NFV applications.  We should probably be talking less about virtual machines and more about containers.

Hardware-level virtualization, meaning classic virtual machines, takes a host and partitions it via hypervisor software into what are essentially separate hardware platforms.  These act so much like real computers that each runs its own operating system, and facilities in the hypervisor/virtualization software make them independent in a networking sense as well.  This approach is good if you assume that you need the greatest possible separation among tenant applications, which is why it’s popular in public cloud services.  But for private cloud, even private virtualization, it’s wasteful of resources.  Your applications probably don’t need to be protected from each other, at least not any more than they would be if run in a traditional data center.

Linux containers (and containers based on other OSs like OpenSolaris) are an alternative to virtual machines that provides application isolation within a common OS instance.  Instead of running a hypervisor “under” OS instances, containers run a virtualization shell over a single OS instance, partitioning the use of resources and namespaces.  There is far less overhead than with a VM because the whole OS isn’t duplicated, and where the goal of virtualization is to create elastic pools of resources to support dynamic componentization of applications, the difference can add up to (according to one user I surveyed) a 30% savings in server costs to support the same number of virtual hosting points.  This sort of savings could be delivered in either virtualization or private cloud applications.

For NFV, containers could be an enormous benefit because many virtual network functions (VNFs) would probably not justify the cost of an autonomous VM, or such a configuration would increase deployment costs to the point where it would compromise any capex savings.  The only problem is that the DevOps processes associated with container deployment, particularly container networking, are more complicated.  Many argue that containers in their native form presume an “instance first” model, where containers are built and loaded and then networked.  This is at odds with how OpenStack has evolved; separating hosting (Nova) and networking (Neutron) lets users build networks and add host instances to them easily.  In fact, dynamic component management is probably easier with VMs than with containers, even if the popular Docker tool is used to further abstract container management.
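
The ordering difference is easier to see in a sketch.  This minimal Python example (hypothetical functions, not Nova, Neutron, or Docker APIs) contrasts the network-first workflow OpenStack encourages with the instance-first workflow native to containers.

    # Illustrative contrast between "network first" (OpenStack-style) and
    # "instance first" (native container-style) workflows.
    def network_first_workflow():
        net = {"name": "tenant-net"}                    # define the network abstraction
        print(f"created network {net['name']}")
        vm = {"name": "vm-1", "network": net["name"]}   # instances join it afterward
        print(f"booted {vm['name']} attached to {vm['network']}")

    def instance_first_workflow():
        container = {"name": "ctr-1"}      # the container exists before any network
        print(f"started container {container['name']}")
        # networking is bolted on afterward, which is where DevOps gets harder
        container["network"] = "tenant-net"
        print(f"connected {container['name']} to {container['network']}")

    if __name__ == "__main__":
        network_first_workflow()
        instance_first_workflow()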

There’s work underway to enhance container networking and DevOps.  Just today, a startup called SocketPlane announced it would be “bringing SDN to Docker”, meaning to provide the kind of operational and networking agility needed to create large-scale container deployments in public and private clouds and in NFV.  There are a few older and more limited approaches to the problem already in use.

Containers, if operationalized correctly, could have an enormous positive impact on the cloud by creating an environment that’s optimized to the future evolution of applications in the cloud instead of being optimized to support the very limited mission of server consolidation.  They could also make the difference between an NFV deployment model that ends up costing more than dedicated devices would, and one that saves capex and perhaps even could enhance operations efficiency and agility.  The challenge here is to realize the potential.

Most NFV use cases have been developed with VMs.  Since in NFV the management of virtualization hosting and networking is the responsibility of the Virtual Infrastructure Manager, or VIM, it is theoretically possible to make containers and container networking (including Docker) work underneath a suitable VIM, which means it would be possible in theory to make containers work with any of the PoCs that use VM hosting today.  However, this substitution isn’t the goal or even in scope for most of the work, so we’re not developing as rich a picture of the potential for containers/Docker in NFV as I’d like.
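
Since the VIM is where hosting specifics are supposed to be hidden, the substitution would in principle be a backend choice.  A minimal Python sketch of that idea, with all names hypothetical:

    # Hypothetical VIM with interchangeable hosting backends; orchestration
    # above it never needs to know whether a VM or a container was used.
    class VmBackend:
        def host(self, vnf):
            return f"{vnf} hosted in a dedicated VM"

    class ContainerBackend:
        def host(self, vnf):
            return f"{vnf} hosted in a container (shared OS instance)"

    class Vim:
        def __init__(self, backend):
            self.backend = backend

        def deploy(self, vnf):
            print(self.backend.host(vnf))

    if __name__ == "__main__":
        Vim(VmBackend()).deploy("nat-vnf")         # what most PoCs do today
        Vim(ContainerBackend()).deploy("nat-vnf")  # what a container-based VIM could do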

One of the most significant questions yet to be addressed for the world of containers is the management dimension.  Anyone who’s been reading my blog knows of my ongoing concerns that NFV and cloud management is taking too easy a way out.  Shared resources demand composed, multi-tenant management practices and we’ve had little discussion of how that happens even with the de facto VM-based approaches to NFV and cloud services.  Appealing to SDN as the networking strategy doesn’t solve this problem because SDN doesn’t have a definitive management strategy that works either, at least not in my view.

The issues that containers/Docker could resolve are most evident in applications of service chaining and virtual CPE for consumers, because these NFV applications are focused on displacing edge functionality on a per-user basis, which is incredibly cost-sensitive and vulnerable to the least touch of operations inefficiency.  Even in applications of NFV where edge devices participate in feature hosting by running what are essentially cloud-boards installed in the device, the use of containers could reduce the resource needs and device costs.

While per-user applications are far from the only NFV services (shared component infrastructure for IMS, EPC, and CDNs are all operator priorities), the per-user applications will generate most of the scale of NFV deployments and also create the most dynamic set of services.  It’s probably here that the pressure for high efficiency will be felt first, and it will be interesting to see whether vendors have stepped up and explored the benefits of containers.  It could be a powerful differentiator for NFV solutions, private cloud computing, and elastic and dynamic application support.  We’ll see if any vendor gets that and exploits it effectively.

Service and Resource Management in an SDN/NFV Age

I mentioned in my blog yesterday that there was a distinct difference between “service management” and “resource management” in networks, and it’s worth taking some time to explore this because it impacts both SDN and NFV.  In fact, this difference may be at the heart of the whole notion of management transformation, the argument on whether we need “new” OSS/BSS approaches or simply need changes to current ones.

In the good old days of TDM networks, users had dedicated capacity and fixed paths.  That meant that it was possible to provide real-time information at a highly granular level, and some (like me) remember the days when you could get “severely errored seconds” and “error-free seconds” data.  When you got a service-level agreement (SLA) it could be written down to the conditions within an hour or even minute, because you had the data.

Packet networking changed all of this with the notion of a shared network and oversubscription.  One of the issues with TDM was that you paid 24×7 for capacity and might use it only for 20% or so of that time.  With packet networks, users’ traffic intermingled, and this allowed more efficient use of resources.  It also meant that the notion of precise management information was forever compromised.  In packet networks, it would be very difficult and expensive to recover the exact state of routes and traffic loads at any given time.  Operators responded by extending their SLA guarantee periods—a day, a week, a month.  Packet networking is all about averages, and so is the management of packet networks.

This is where the service/resource management differences arose.  The common principle of packet networks is to design for acceptable (within the collective SLAs) behavior and then assume it as long as all the network’s resources are operating within their design limits.  So you managed resources, but you also sought to have some mechanism of spotting issues with customer services so that you could be proactive in handling them.  Hence, the service/resource management split; you need both to offer SLAs and reasonable/acceptable levels of customer care and response.

The ability to deliver an SLA from a shared-resource packet network depends in large part on your ability to design the network to operate within a given behavioral zone, and to detect and remedy situations when it doesn’t.  That means a combination of telemetry and analytics, and the two have to be more sophisticated as the nature of the resource-sharing gets more complicated.  To the extent that SDN or NFV introduce new dimensions in resource sharing (and both clearly do) you need better telemetry and analytics to ensure that you can recognize “network resource” problems and remedy them.  That gives you an acceptable response to service problems—you meet SLAs on the average, based on the violation policies that your own performance management design has set for you.

However, SDN and NFV both change the picture of resource-sharing just a bit.  First, I’ll use an SDN example.  If you assign specific forwarding paths to specific traffic from specific user endpoints, from a central control point, you presumably know where the traffic is going at any point in time.  You don’t know that in an IP network today because of adaptive routing.  So could you write a better, meaning tighter, SLA?  Perhaps.  Now for NFV, if you have a shared resource (hosted facilities) emulating a dedicated device, have you created a situation where your SLA will be less precise because your user thinks they’re managing something that’s dedicated to them, and in fact is not?

In our SDN example, we could in theory derive pretty detailed SLA data for a user’s service by looking at the status of the specific devices and trunks we’d assigned traffic to.  However, that raises the question of mechanism.  Every forwarding path (route) through an SDN network has a specific resource inventory, and we know what that is at the central control point.  But is the status of the network the sum of all the route states?  Surely, but how do we summarize and present them?  Management at the service level should now be viewed as a kind of composite: a gross state derived from average conditions by some algorithm, with a drill-down to path-level state as needed.  That’s not what we have today.  And if SDN is offered using something other than central control, or if parts of the network are centralized and parts are not, how do we derive management then?
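
One way to picture that composite (a minimal Python sketch of my own, not any product’s mechanism): roll per-path states up into a gross service state by some algorithm, worst-of-the-paths here, while keeping the path detail available for drill-down.

    # Illustrative roll-up of per-path state into a service-level view, with
    # drill-down preserved.  The states and the worst-of algorithm are examples.
    SEVERITY = {"ok": 0, "degraded": 1, "failed": 2}

    def service_view(path_states):
        """Summarize path states into one service state; keep detail for drill-down."""
        worst = max(path_states.values(), key=lambda s: SEVERITY[s])
        return {"service_state": worst, "paths": path_states}

    if __name__ == "__main__":
        paths = {"path-a": "ok", "path-b": "degraded", "path-c": "ok"}
        view = service_view(paths)
        print("service:", view["service_state"])   # the gross, summarized state
        print("drill-down:", view["paths"])        # the path-level detail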

In NFV, the big question or issue is the collision of management practices and interfaces of today with virtual infrastructure.  A user can manage a dedicated device, but their management of a virtual device has to be exercised within the constraints imposed by the fact that the resources are shared.  I can never let a user or a service component exercise unfettered management on a resource that happens to host a part of their service because I have to assume that could compromise other users and services.

All of this adds up to a need for a different management view.  Logically what I want to do is to gather all the data on resource state that I can get, at all levels.  What I then have to do is to correlate that data to reflect the role of a given resource set in a given service, and present my results in an either/or/both sense.  On the one hand, I have to replicate as best I can the management interfaces that might already be consumed for pre-SDN-and-NFV services.  They still may be in use, both at the carrier and user levels.  On the other hand, I have to present the full range of data that I may now have, in a useful form, for those management applications that can benefit.  This is what “virtualizing management” means.

What we need to have, for both SDN and NFV, is a complete picture of how this resource/service management division and composition process will work.  We need to make it as flexible as we can, and to reflect the fact that things are going to get even more complicated as we evolve to realize SDN and NFV fully.

Here’s What I Mean by Top-Down NFV

I’ve talked in previous blogs about the value of a top-down approach to things like NFV, and I don’t want to appear to be throwing stones without offering a constructive example.  What I therefore propose to do now is to look at NFV in a top-down way, the way I contend a software architect would naturally approach a project that in the end is a software design project.

Top-down starts with the driving benefits and goals.  The purpose of NFV, the goal, is to permit the substitution of hosted functionality on virtualized resources for the functionality of traditional network devices.  This substitution must lower costs and increase service agility, and so it must be suitable for automated deployment and support.  A software person would see this goal in four pieces.

First and foremost, we have to define the functional platform on which our hosted functionality will run.  I use the qualifier “functional” because it’s not necessary that virtual network functions (the hosted functionality in NFV terms) run on the same OS or physical hardware, but only that they have some specific functional resources that support them.

I contend that the goal of NFV can be achieved only if we can draw on the enormous reservoir of network features already available on popular platforms like Linux.  Therefore, I contend that the functional platform for VNFs has to be directed at replicating the connection and management framework that such a network feature would expect to have, and harnessing its capabilities to create services.

Second, we have to define a compositional abstraction that permits the creation of this functional platform.  A functional platform would be represented by a set of services offered to the VNFs, like the service of connectivity and the service of management.  These services have to be defined in abstract terms so that we can build them from whatever explicit resources we have on hand.  This is the approach taken by OpenStack’s Neutron, for example, and also by the OASIS TOSCA orchestration abstraction.

A compositional abstraction also represents what we expect the end service to be.  A “service” to the user is a black box with properties determined by its interfaces and behavior.  That’s the same thing that a service to a VNF would be, so the compositional abstraction process is both a creator of “services” and a consumer of its own abstractions at a lower level.

We host application or service components inside an envelope of connectivity, and so I think it’s obvious that we have to recognize that compositional abstractions have to include the network models that are actually used by applications today.  We build subnets, Ethernet VLANs, IP domains, and so forth, so we have to be able to define those models.  However, we shouldn’t limit the scope of our solution to the stuff we already have; a good abstraction strategy says that I could define a network model called WidgetsForward that has any forwarding properties and any addressing conventions I find useful, then map it to elements that will produce it.  A compositional abstraction, then, is a name that can be used to describe a service that’s involved in some way with a functional abstraction.
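
As a minimal Python sketch of the idea (all names hypothetical): compositional abstractions are just named models in a catalog, each mapped to a recipe that realizes it from lower-level elements, and WidgetsForward can sit alongside Subnet or VLAN.

    # Illustrative catalog of compositional abstractions: a model name maps to
    # a recipe that realizes it from lower-level elements.
    MODEL_CATALOG = {}

    def define_model(name, recipe):
        MODEL_CATALOG[name] = recipe

    def realize(name, **params):
        print(f"realizing model '{name}'")
        return MODEL_CATALOG[name](**params)

    # Familiar models...
    define_model("Subnet", lambda cidr: {"elements": ["router-port", "dhcp"], "cidr": cidr})
    define_model("VLAN", lambda vid: {"elements": ["switch-config"], "vid": vid})

    # ...and one we invented, with its own forwarding and addressing conventions.
    define_model("WidgetsForward",
                 lambda widgets: {"elements": ["custom-forwarder"], "widgets": widgets})

    if __name__ == "__main__":
        print(realize("Subnet", cidr="10.0.0.0/24"))
        print(realize("WidgetsForward", widgets=["w1", "w2"]))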

The third thing we have to define is a resource abstraction.  We have resources available, like servers or VMs, and we need to be able to define them abstractly so we can manipulate them to create our compositional abstractions.  If we have a notion of DeployVNF, that functional abstraction will have to operate using whatever cloud hosting and connectivity facilities are available from a particular cloud infrastructure, but we can’t let the specific capabilities of that infrastructure rise to the point where they’re visible to our composition process, or we’ll have to change our service compositions for every resource variation.

Here we have to watch out for specific traps, one of which is to focus on device-level modeling of resources as our first step.  I don’t have anything against YANG and NETCONF in their respective places, but I think that place is in defining how you do some abstract resource thing like BuildSubnet on a specific network.  You can’t let the “how” replace the “what”.  Another specific trap is presuming that everything is virtual just because some stuff will be.  Real devices will have to be a part of any realistic service for likely decades to come, and so the goal of resource abstraction is linked to the goal of functional abstraction in that what we create with VNFs has to look like what we’d create with legacy boxes.

The final thing you need is a management abstraction.  We’re forgetting, in many NFV implementations, something that operators learned years ago with router networks.  Any time you have shared resources, you have to acknowledge that service management and resource management are not the same thing.  Composing services based in whole or in part on virtual resources is only going to make this more complicated, and how we manage services we’ve composed without collaterally composing a management view is something I don’t understand.  Largely because I don’t believe it’s possible.

Management abstractions are critical to functional platforms because you have to be able to provide real devices and real software elements with the management connections they expect, just as you need to provide them with their real inter-component or user connections.  But the connection between a VNF or router and its management framework has to be consistent with the security and stability needs of a multi-tenant infrastructure, which is what we’ll have.
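
Here’s a minimal Python sketch of the kind of management abstraction I mean (all names hypothetical): the VNF or user gets a management view scoped to its own service, and every operation is mediated before it touches the shared resource.

    # Illustrative management abstraction: a per-tenant proxy that presents the
    # management view a VNF or user expects while protecting shared resources.
    class SharedResource:
        """A resource that actually hosts pieces of several tenants' services."""
        def __init__(self):
            self.state = {"svc-a": "up", "svc-b": "up"}

        def read(self, service_id):
            return self.state[service_id]

        def restart(self, service_id):
            self.state[service_id] = "restarting"

    class TenantManagementProxy:
        """What a tenant sees: its own slice, and only whitelisted operations."""
        ALLOWED = {"status", "restart"}

        def __init__(self, resource, service_id):
            self.resource = resource
            self.service_id = service_id

        def invoke(self, operation):
            if operation not in self.ALLOWED:
                raise PermissionError(f"{operation} not permitted on shared infrastructure")
            if operation == "status":
                return self.resource.read(self.service_id)
            self.resource.restart(self.service_id)
            return "restart requested"

    if __name__ == "__main__":
        proxy = TenantManagementProxy(SharedResource(), "svc-a")
        print(proxy.invoke("status"))    # allowed, scoped to the tenant's own service
        print(proxy.invoke("restart"))   # allowed, but mediated
        # proxy.invoke("reboot-host")    # would raise: unfettered access is blocked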

If you look at just this high-level view, you can see that the thing we’re missing in most discussions about NFV is the high-level abstractions.  We have come close to making a mistake that should be obvious even semantically.  “Virtualization” is the process of turning abstractions into instantiations, and yet we’re skipping the specific abstractions.  I contend that we have to fix that problem, and fix it decisively, to make NFV work.  I contend that we’ve not fixed it with the ISG E2E conception as yet, nor have we defined fixing it as a goal of OPNFV.  This isn’t rocket science, folks.  There’s no excuse for not getting it right.

How to Avoid Management Silos in a Virtual World

The need to modernize operations practices to make them more agile and efficient is pretty obvious.  The need to organize complex software deployments, particularly those involving componentized applications, is also obvious.  So is the need to do efficient allocation of features and components to virtualized infrastructure.  What is not yet obvious is just how to do it.

A lot of this modernizing and organizing is a matter of APIs.  Generalized management tools have to accommodate a variety of network equipment, and this is usually accomplished using management APIs.  But if every vendor has their own approach, their own APIs, then there’s too much customization needed to make it likely that high-level tools could be applied to a given multi-vendor network.  Standardization is one way of addressing this; LightReading did an article on the MEF’s APIs for carrier Ethernet and CenturyLink’s goal of standardizing on these APIs.  Another approach is a vendor platform that embraces at least the major legacy vendors; Cyan, for example, offers an orchestration platform for SDN/NFV that was recently expanded to include the ability to control Cisco and Juniper switches.

There are obviously a lot of vendor strategies (likely as many as there are vendors) but there are also divisions in the standards approach.  At a high level, you could divide operations efficiency tools according to whether they were “network-centric” or “OSS/BSS-centric” in their evolution.  The former seek to establish a common means of controlling devices, leaving the existing OSS/BSS to define how this common-control approach links to operations.  The latter assume that operations systems themselves have to evolve in some way.

Any multiplicity of approach leads to user confusion and to potentially higher costs and substandard results, but it’s not this high-level division that worries me.  I’m more concerned about the service-specific approach that the MEF APIs, CenturyLink’s embrace of them, and the Cyan Blue Planet focus all seem to suggest is evolving.

Arguably, the real value behind the IP convergence was the elimination of service-specific network hardware silos.  Five networks for five services invites not only inefficient operations but inefficient use of capacity since what’s available for one service is lost to others that don’t share the infrastructure.  But five operations/management silos on top of converged infrastructure doesn’t make much sense either, and that’s what we might end up risking here.

On the surface, it might seem very logical to develop operations tools and practices for something like carrier Ethernet.  We have a body (the MEF) focused on the service, we have specific providers who offer it, customers who depend on it, and equipment that’s designed to support it.  It’s also probably easy to make a business case for agility here, and to define very specific goals regarding the kinds of service additions and changes expected and the tolerance for cost and delay.  But thirty or more years ago, it was also easy to justify multiple service-specific networks for many of the same reasons.

Capital costs are a declining piece of overall service costs, and cost management isn’t the only path to building revenues and profit margins.  Operators in my own surveys have valued both service agility and operations efficiency higher than capital cost management for about five years now, reflecting no doubt the idea that they have a handle on how to manage capex but are a lot less sure about the agility/efficiency stuff.  We could well see a flood of initiatives to address agility/efficiency, and we could well create a lot of silos by following through on each.

I’m worried that if we start looking at how to make Ethernet services efficient, or maybe VPN services efficient, we’ll find an answer for both, but not the same answer.  The carrier Ethernet service could well have much the same equipment as pieces of cloud computing service or content delivery.  Buyers may mix multiple “services” into a retail offering.  Do we expect the people who use Ethernet in the cloud or VPNs to come up with their own management strategies or to use ones that evolve out of carrier Ethernet?  If the latter is the goal, how likely is it that we’ll be able to do what we want if we’ve given no thought to the requirements of these additional service areas?

Another source of risk here, IMHO, is the notion of managing devices and virtual devices as the path to managing services.  There is, in SDN or NFV, nothing that necessarily corresponds to a “router” or a “switch”.  There is functionality that can mimic the devices, but that functionality doesn’t likely have the properties of the devices themselves.  It may well not be localized but distributed instead; it may even move around dynamically.  The point is that a “virtual device” is an abstraction.  We might elect to recognize virtual devices that map 1:1 to current real devices, but would we constrain our evolution forever by demanding that mapping?  If I’m using a synthesizer to create music, why say that my orchestra contains only brass, woodwinds, and strings?  A rapoor (to cite an old science-fiction yarn) might be an imaginary instrument, but if we can do whatever we imagine, why not do one?

What’s needed here is some higher-level organization, something that could in fact come (or could have come) from multiple sources.  In modern terms, we need abstractions that represent manageable elements of functionality.  In short, what we need to do is to assemble what we can control about a set of hosted or installed behaviors, and then represent them as something like a virtual device.  The problem, therefore, is not that we have virtual devices, it’s that we insist that all virtual devices are based on limited real-device elements.  That limits us to current networking concepts.
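
Sketched minimally in Python (hypothetical names throughout), a virtual device is a management facade assembled over whatever set of hosted behaviors we can control, which is why it doesn’t have to map 1:1 to any real box.

    # Illustrative "virtual device": a management facade over a collection of
    # hosted or installed behaviors, which need not map to any single real box.
    class Behavior:
        def __init__(self, name, state="up"):
            self.name = name
            self.state = state

    class VirtualDevice:
        """Looks like one manageable device; is really a bundle of behaviors."""
        def __init__(self, name, behaviors):
            self.name = name
            self.behaviors = behaviors

        def status(self):
            # present one device-like status derived from all the pieces
            worst = "up" if all(b.state == "up" for b in self.behaviors) else "degraded"
            return {"device": self.name, "state": worst,
                    "members": {b.name: b.state for b in self.behaviors}}

    if __name__ == "__main__":
        # a "virtual router" assembled from distributed, hosted pieces...
        vrouter = VirtualDevice("vRouter-1", [Behavior("forwarding-vnf"),
                                              Behavior("route-control-vnf")])
        # ...or something with no real-device analog at all
        rapoor = VirtualDevice("rapoor-1", [Behavior("custom-forwarder"),
                                            Behavior("policy-engine", "degraded")])
        print(vrouter.status())
        print(rapoor.status())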

An elastic model of virtual devices lets us embrace what we have, define whatever we decide we’re evolving to, and sustain management through all the transitions.  Those are properties critical in an age where falling profits are pressuring everyone to be more efficient.  Whatever we waste in operations costs or opportunity benefits isn’t available to fund network expansion, dear vendors.

The TMF could define this.  The OMG, or the ONF, or the NFV ISG or even OPNFV, could all define something like this.  However, the time to do the defining is early in the process of building the framework for management, and we may have passed that optimum time already for most or all of these bodies.  If that’s the case, we need to think hard about the possibilities of going back to do it right.

Can Networking and IT Escape Commoditization?

I noted yesterday that HP’s decision to break itself into two companies would likely increase pressure on Cisco to fragment as well, pressure that began more than a decade ago.  Even then, the Street saw that switching and routing were low-growth businesses, which meant they’d tend to tie better product segments to the ground and hold back share appreciation.  This theme is continuing today with a report on CNBC where a VC said he expects “every large tech company” to split within five years.  He’s exaggerating, I think, but not by much.  Most will split; a few will remain intact.

All of this can be traced to something I’ll call the “Novell Syndrome”, named after the fabled star of local networking in the ‘80s.  Novell seemed to be on the cusp of explosive growth, and yet suddenly it fell from grace.  What caused it was simple; there are only so many features a buyer values enough to make a favorable purchase decision.  When you run out of new ones you can’t drive expansion in total market opportunity or refresh cycles, and your days of growth are ended.  I’ve called these factors “enablers” for years now.

In networking, the product of all our technology is bits, meaning transport/connection of information.  I remember (and a few of you likely do as well) when simply accessing a 45 Mbps pipe through your LEC would cost nearly ten thousand dollars per month.  In round numbers we’re talking about access costing about $230 per megabit.  Today we can buy bits for about three bucks per megabit, often even less.  The point is that the Internet age made social networking, online services, and more possible by making it cheap.  We assumed a static value proposition—people wouldn’t pay more for something better—and focused on making bits cheap enough to be justified without a lot of revolutionary assumptions.  Networking went populist the easy way.

This is what created the OTT space.  Cheap bits meant capacity to exploit to deliver new experiences, and telcos didn’t do much to create those experiences.  Some of that was inertia, some was regulatory, but overall the result was the same—the “value” part of networking separated from the commodity part.  You could make the same argument about IT, where hardware is the commodity and software the value.  All a computer is good for is to run something, after all.  And many computer companies never got the memo that it was really software people wanted, which created software companies like Microsoft.

Suppose pundits are right and all tech breaks up.  The same forces are still operating; bit cost pressure won’t become any more tolerable.  What happens is that companies can’t provide naturally integrated portfolios because they don’t sell all the pieces.  That means that we’ll have to somehow put together the services and applications that are creating the value, the things that Novell lost for themselves decades ago.

The reason I’m mentioning this as we consider the ultimate commoditization of tech is that the very act of transformation that’s been driven by short-sighted moves is also creating an opportunity to be “long-sighted” again.  SDN and NFV are a lot of things depending on your perspective, but one is surely that they’re a technical mechanism for fusing the value and commodity pieces of networking and IT.  We aren’t really arguing about cost-based hosting of features or services here, no matter what people think.  Just as it was with computers and networks of old, we’re arguing about how the value is created, how those enablers are defined.

A lot of this industry is dedicated to resisting the loss of the “good old days”.  So was Novell, and it didn’t work, nor will it work now.  Cisco may well be one of those.  A decision to reorganize away from product silos for both hardware and software can be given an innovation-driven slant, but it’s probably more likely to be driven by a desire to eliminate redundant work and reduce costs.  That facilitates, not prevents, commoditization.  Some are also dedicated to applying SDN and NFV principles directly to cost reduction—both SDN and NFV organizations tend to do that now.  That gets you to that same bad place even quicker.

Our hardware and network transformations have created the same kind of imbalance between capability and the ability to exploit it that microprocessors and consumeristic bit pricing created.  We’re looking now for how to harness that to create stuff that’s not just a cheaper way of doing something we already do, but rather a new way of doing something we’d like and value.  But we’re not going to get there the way we’re going now, by forcing everything new and revolutionary about our best new concepts to fit the mold of old services and experiences.  How far would the Internet have gone if all we did with cheap bits was to make traditional point-to-point T3 lines cheaper?

If all the big tech companies were to break up, we’d have an industry with no leaders, but that might not be a bad thing if the leaders won’t lead us to the Promised Land.  Absent a Cisco or HP or IBM as a single point of purchase, we face the basic problem of integrating technical elements around needs.  We could expect users to get smarter (resulting in a TAM implosion since smartness is never a growth industry), we could expect them to pay for the services of integration (a non-starter given that users think everything should be free), or we could figure out how to architect the services, features, applications, and experiences in a way that would allow for easy assembly and deployment.  That, my friends, is what SDN and NFV should be about, and what both have steadfastly refused to face.

We’re asking where our inventors are, where the “innovators” of the Valley have gone.  There’s a better question.  Where are our visionaries?  I can tell you where they need to be, they need to be doing something rich and vibrant and exciting with SDN and NFV, because that is our last great hope of taking both networking and IT out of the doldrums and into a new age.

HP’s Sum-of-the-Parts Challenge

There was a time when “synergies” were a big thing in tech.  The notion of a company as a one-stop shop was considered to be both a plus from a sales efficiency perspective and a means of creating pull-through by artfully constructing feature lists on loosely related products.  No more, apparently.  The Street has been looking at big tech companies as pits of dross mixed with gold, and they want the two separated to create shareholder value.

HP has just decided to stop resisting Street pressure and split off its PC and printer business from its enterprise/services business.  The theory is that the latter, untied from the drag of low-margin and low-growth products and benefitting from undiluted management attention, would be able to shine.  It’s a nice theory, but it’s got to be proved in execution, and that may be harder than it looks.  There are two issues HP must face: shareholder value in the near term and the operational success of the units in the longer term.  How these go will depend on some key points raised, but not yet answered, by the breakup announcement.

Let’s start with the point about the “drag” of the low-margin business.  PCs and printers are in fact low-margin and that’s probably not going to get better.  But they’re also a cash cow, and HP could in theory have used some of the cash to sponsor M&A that might have helped its enterprise, cloud, and NFV aspirations.  Thus, while HP’s commodity products do hurt margins in the short term, they can fund expansion.  At the least, this suggests that HP might have better waited a bit and organized its product line on the enterprise side before it jumped.

The issue of management focus is also ambiguous at best.  It’s not like PC/printer types were running the enterprise and services businesses as sideline hobbies, after all.  Yes, senior management was balancing attention between the two units, but how much attention will the splitting up and its aftermath eat?  Anyway, I think the main point is that lower-level executives and planners are doing most of the heavy lifting in the enterprise/services space and they are already dedicated to that task.

But the big question is that “shine” thing.  What exactly is HP supposed to do on the enterprise and services side?  The classic Wall Street proposition is that HP will focus on “the cloud”, of course, but private cloud is interesting to enterprises to the extent that it reduces costs, which means reducing revenue for vendors like HP.  Winning there would therefore likely result in losing.  The big opportunity in the cloud is most likely to come from service providers, so HP should be thinking about how to harness that sector.

One way would be to recognize that the conceptual model of NFV is really a model of efficient orchestration and management of cloud-based resources, not necessarily just the creation of service features using cloud or other hosting.  HP has a pretty good NFV approach, and it wouldn’t take much for the company to turn it into a very good cloud story.

Complex applications, even if they’re deployed on static resources, need tools like DevOps to ensure that all the pieces are correctly installed and connected.  If you add to this the need to deploy on virtual resources, optimization of hosting based on user location or resources, and management of dynamically assigned components that might even be shared among applications, you need something really sophisticated.  Arguably NFV is the most complex of the current applications of management/orchestration, but I’m not sure that IoT and even dynamic point-of-activity services won’t rival NFV in that regard.  And these applications take management/orchestration out of NFV/operator domains and put it squarely in the purview of the enterprise.  HP could take the lead here.

“Could” is the operative word because, of course, there was nothing preventing HP from jumping into this vision with both feet even before the notion of splitting the company came along.  Whether the split makes the execution of a full cloud vision easier or not, it likely makes it essential because it’s hard to see how the enterprise/services segment is going to shine any other way.

It nets out like this.  In the short term, there is a risk that the “old” HP, the PC/printer unit, will be subject to selling pressure as shareholders unload that portion of their stock, hedging against inevitable commoditization.  In the long term, it’s hard to see how the PC unit doesn’t get sold off to somebody, likely on less favorable terms than HP was prepared to accept earlier.  If IBM’s brand can’t pull you through in PCs, who can?  Probably not HP (or Dell, for that matter).

For HP-Enterprise, the risk is in the long term.  There’s no clear way for the breakup to facilitate improvement in HP’s server position either with enterprises or in the cloud.  Same with software, which means that services are an uphill battle.  The company has strong assets it could play, as I’ve said, but it could have played them at any point in the last year just as easily.  Which means it’s not easy for them to play those assets for some reason, and they have to fix that quickly.  The Titanic broke up before it sank.

There’s another point to consider here, which is how this kind of breakup pressure could spread.  Cisco has been another target of Street interest in breaking out high-growth elements of a product line from lower-growth pieces.  HP’s news has already raised the Cisco point among some Street types, and the issue won’t die away easily.  The problem is that while there are always “high-growth” areas that you can point to, they often prove to be no-growth areas or have very limited total addressable market.  Telepresence, for example, was a darling of the Street when it was first raised by Cisco.  How well would Cisco as a pure telepresence play have done?  So maybe there’s something to be said for symbiosis, huh?

Can OPNFV Really Move the NFV Ball?

The Open Platform for NFV initiative appears to be getting up steam, with the support of vendors and network operators alike.  I’m a long-standing fan of open-source software and a specific advocate of it in the NFV space (see my ExperiaSphere project), but while there are some hopeful signs in OPNFV so far I’m still concerned about the direction the project will take and the good that can come of it.  To get to the questions and concerns, we need to start off with the OPNFV activity in brief.

OPNFV operates under the Linux Foundation, and it’s an open-source project rather than an alternative to the ETSI NFV ISG.  In fact, the goal of the project is to support and implement the ETSI specification.  Like other open projects (OpenDaylight for example), it has both code contributors and member companies, and the latter pay fees.  Participation outside these two categories doesn’t appear to be in the cards, which means that interested parties who don’t write code or can’t spend the member fees can influence OPNFV only via the ETSI ISG, which still permits free memberships.

The white paper on OPNFV, and the overview of the project on the website, frame the goals in terms of the ETSI E2E architecture, and specifically the combination of the NFV Infrastructure and Virtual Infrastructure Managers (NFVI and VIMs, respectively).  From here, the project will presumably move to address the other blocks of the E2E document, eventually filling in all of the blocks.  Some of the work will presumably be new development and other parts will exploit other projects, including OpenStack, Linux, DPDK, etc.

The first question that comes to mind relating to OPNFV is simple; is the ETSI NFV E2E specification enough to drive an open-source project?  As someone who was a software architect and engineer in a prior life, I’ve been honest from the very first in my concerns about one aspect of the work—it didn’t follow the traditional “top-down” approach of software projects.  I said from the outset that if you want to define something that’s going to be software, you have to do the job as a software project.

The purpose of NFV software is to run virtual functions, logically speaking.  That means that you need to decide where the virtual function software is expected to come from and what “services” you’d provide that software when it runs.  That wasn’t done with the ISG, and the OPNFV doesn’t propose to start with the VNFs either.  Thus, I can’t be sure that the most basic question—what does the NFV software do to run VNFs—can be answered.

NFVI and the VIMs are at the bottom of the process.  What they have to do is support the things above, and so starting off by defining them and developing or identifying code creates a risk that you’re building the foundation without knowing if this is a single-family ranch or a skyscraper.  And you could make the same comment regarding the management interfaces.  How can you decide how to manage resources without knowing how resource management relates to virtual function management?

There is a missing piece in the puzzle here, an important one.  What you need in order to build software is a software architecture, which the ETSI process never professed to be building.  An architecture links the goals, the high-level drivers of benefits, to the processes and relationships needed to fulfill them.  And if you do an architecture right, you do it in an open way so that it doesn’t constrain your avenues of approach and execution.  An open architecture is platform-independent; it admits both open-source and proprietary solutions.  We need an architecture for NFV, and that’s a basic truth.

The second question relates to what we did get, which is the specific interfaces.  Interface specifications help create open solutions, but they should be built on the architecture that links benefits to functions.  OPNFV is, according to its white paper, proposing to address the implementation of NFV by implementing the interfaces, but the lack of a unifying software architecture means the interfaces may not be sufficient.  We should think of infrastructure, the first goal of OPNFV, as a collection of resources available to an Orchestrator to build a service.  This implies that the resources are represented as a set of abstractions that can be mapped to a set of platforms and tools.  Where are these abstractions?  What features of infrastructure, whether virtual elements or behaviors of real network devices, are needed to represent services?  That would have been a fundamental element in any formal software architecture, but just implementing interfaces won’t provide these critical pieces.  Where would the notion of virtualization be today if we never had the notion of a “virtual machine?”

There’s also a lower-level question here.  If you want to support multiple cloud architectures (OpenStack, CloudStack) and if you want to support multiple hardware platforms and operating systems, you have to assume that the interface between the VIM and the NFVI elements is variable depending on what’s down in the NFVI.  How then do you define the Nf-Vi interface, which is the one used for the VIM-to-NFVI connection?  The fact is that the interface to the VIM could be based on standard abstractions, but the interface between the VIM and the NFVI has to speak the language of the underlying elements, or you have to rewrite what’s already available.

Finally, we have the third question, which is how long will this take?  OpenStack and OpenDaylight have been going on for some time, and yet they’re still in an early phase.  Is it possible that we’re defining an open-source platform that won’t be ready in time to make NFV an effective tool in managing carrier migration to software-and-server functionality?  Is it possible that we’ll see “member wars” within OPNFV just as we’ve seen them in other bodies, wars created when sponsors try to get the project aligned with their own interest?  Might some even be accused of becoming involved just to obstruct?

I’m not saying all of this will be an insurmountable problem, or even a problem at all.  I am saying that we don’t know at this point whether OPNFV will insure these questions don’t become problems because the nature of what they’ll produce and how they’ll work isn’t yet clear.  And my final point, time, is marching on.

So here’s my conclusion.  The ISG process should have taken a top-down approach.  The OPNFV process, by accepting the ISG model in the first place and then starting at the bottom of that model, is hardly likely to reorient all of NFV toward that optimal top-down model.  However, spec work is all theory and any implementation approach is a positive step toward solidifying the issues associated with real NFV software.  That means that at the very least we’re likely to learn a lot from the OPNFV work.  I just hope we won’t pay too much of a price along the way.