What Early M2M Can Teach Us About Modern Technology Revolutions

Years ago I was a member of the Data Interchange Standards Association (DISA), involved in electronic data interchange (EDI).  This body provided message interface specifications for common business transactions, and because of that you could say it was a very early (and widely successful) example of M2M.  I was thinking about EDI yesterday, and I wondered whether there were things about EDI, M2M, IoT, and even NFV that might be useful to connect.

In the old days, people used to send purchase orders and other commercial paper by express mail, largely because 1) electronic copies would be subject to arbitrary changes by any party and 2) without some format standards, nobody would be able to interpret an electronic document except a human.  That kind of defeats the purpose of the exchange, or at the minimum limits its benefits.  EDI came about largely to address these two points.

One of the founding principles of EDI was a guarantee of authenticity and non-repudiation.  Somehow, you had to be sure that if you sent an electronic something, your recipient didn’t diddle the fields to suit their business purposes.  The recipient had to be sure that their copy would hold up as well as a real document as a representation of your own intent.  EDI achieved this by using a trusted intermediary, a network service that received and distributed electronic commercial transactions and was always available as a source of the authentic message exchange if there was a dispute.

Message authenticity is critical in just about everything today.  Commercial EDI is still lively (more than ever, in fact) but we’re now looking for other mechanisms for guaranteeing authenticity.  The most popular of the emerging concepts is the blockchain mechanism popularized by Bitcoin.  One of the things that could make blockchain useful is that it can be visualized as a kind of self-driving activity log, a message whose history, actions, and issues follow it and can always be retrieved.

If we start to visualize applications as loose chains of microservices, a worthy cloud vision for sure, we have to ask how we’d know that anyone was who they said they were and whether any request/response could actually be trusted in a commercial sense.  For services like SDN and NFV there’s the problem of making sure that transactions designed to commit or alter resources are authentic and that changes made to services that impact price can actually be traced back to somebody who’s responsible for paying.

I think we see the future of IT and networking too much through the lens of past imperfections.  IT’s history is one of moving the computer closer to the activity, culminating in my view in what I’ve called “point-of-activity empowerment”.  I’ve lived through punch-card and batch processing, and one thing I can see easily by looking back is that the difference between past and future, in application terms, is really one of dynamism.  I have to connect services to workers as needed, not expect workers to build their practices around IT support or (worse) bring their results to IT for recording.

The problem is that dynamism means loose coupling, widely ranging information flows, many-to-many relationships, and a lot of issues in authentication.  We’ve looked at cloud security largely in traditional terms, but traditional applications won’t make the cloud truly successful.  We need those point-of-activity, dynamic, applications and we need to solve their security problems in terms appropriate to a dynamic, high-performance, loosely coupled future.

The issue of formatting rules, what we’d call APIs today, was also important because it was always assumed that EDI linked software applications and not people.  You had to provide not only a common structure for an electronic transaction, you had to be sure that field codings were compatible.  The classic example from some old DISA meetings was someone who ordered 100 watermelons and had it interpreted as 100 truckloads!

One thing that tends to get overlooked here is that microservices can be a substitute for structured information.  If we think of a purchase order, for example, we think of a zillion different fields, most of which will have specific formatting requirements that we need to encode to support properly.  If we viewed a PO as a series of microservices, we’d have only a couple fields per service.  The biggest difference, though, is that it’s common to think of web service results in a format-agile way, so we have practices and tools to deal with the problems.
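
To make that concrete, here’s a rough sketch (the paths and field names are hypothetical, and it assumes Flask is available) of a purchase order exposed as a couple of small, format-agile microservices rather than one big, rigidly structured document:

```python
# A minimal sketch (hypothetical paths and field names) of a purchase order
# decomposed into small, format-agile microservices instead of one rigidly
# structured EDI-style document.
from flask import Flask, jsonify

app = Flask(__name__)

# Each service carries only a couple of self-describing fields, so a consumer
# can tolerate format variation instead of parsing fixed field positions.
PO_HEADERS = {"PO-1001": {"buyer": "Acme Corp", "issued": "2016-01-04"}}
PO_LINES = {"PO-1001": [{"item": "watermelon", "quantity": 100, "unit": "each"}]}

@app.route("/po/<po_id>/header")
def po_header(po_id):
    return jsonify(header=PO_HEADERS.get(po_id, {}))

@app.route("/po/<po_id>/lines")
def po_lines(po_id):
    # The explicit "unit" field avoids the watermelons-versus-truckloads problem.
    return jsonify(lines=PO_LINES.get(po_id, []))

if __name__ == "__main__":
    app.run(port=8080)
```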

The web, JSON, XML, and similar related stuff also provide us guidance on how to deal with data exchanges so that structure and content are delivered in parallel.  There are also older approaches to the same goal in SOA, but one thing that seems to me to be lagging behind is a way of providing access to information in a less structured way.  It’s not as simple as “unstructured data” or even “unstructured data analytics”.  The less structure you have in information, the more free-form and contextual your application has to be for its users/workers.

The only logical way to disconnect from this level of complexity is to abstract it away.  If we have to provide workers with information about stuff to support a decision, we have to be able to facilitate the worker’s ability to interpret a wide variety of information in many different formats.  If, on the other hand, we simply supply the worker with the answer we’re looking for, then the structural issues are all buried inside the application.

This might sound like we’re limiting what workers can know by saying that “answer services” will hide basic data, but remember that if we stay with the notion of microservices we can define many different answer services that use many different information resources.  And answer services, since they provide (no surprise!) answers and not data, are inherently less complex in terms of formats and structures.

There used to be EDI networks, networks that charged enormous multiples over pure transport cost just to supply security, authentication/repudiation management, and structure.  Imagine how efficient our EDI processes would be today had we applied our technology changes simply to changing how we do those things, rather than to whether we now need to do them.  Modernizing networking, modernizing IT, means breaking out of the old molds, not just pouring new stuff into them.

Verizon’s Three-Tier Model and the Adoption of SDN/NFV

In a recent interview Verizon’s CEO talked about a three-tier strategy for the company.  You start with the best connectivity, add platforms that can drive traffic, and then add content, applications, and solutions selectively where you need an ecosystem.  While the interview was focused on mobile plans, you can see it has potentially broader value in assessing changes to networks and services.

Connectivity is access to customers and sites, which to me means that you have to focus on access infrastructure in both mobile and wireline terms.  Operators sell connectivity as their primary product.  Platforms that drive traffic are things that validate connectivity, including things like the Internet and VPNs.  Most of what we call “services” are really platforms that validate connectivity.  The top layer, where the ecosystems live, would include content and also something like IoT.

One thing this characterization of Verizon’s strategy shows is that it’s a top-down approach.  You’re not hearing about SDN or NFV or even the cloud here.  CEOs start with what they want at the business level and work downward from that, a kind of technology-planning-by-objectives approach.  This is why I think so many of our technology revolutions are in trouble; they’ve started at the bottom and are groping their way upward toward hoped-for benefits.  Certainly it shows that today’s bottom-up projects and senior management buy-in are going to have to converge at some point, because they’ve started off in different places.

We could apply a top-down approach to “connectivity dissection” to illustrate my point.  The priority in spending on connectivity has to be establishing the access path because absent access we have nothing.  Exploiting access should really be about service platforms, and so things like SDN and NFV would be viewed by how they enhance connectivity (by cheapening it or by augmenting the platforms we can support).  But what would these new technologies do to enhance things?  One answer, provided by Brocade, is the “every customer has their own network” model.  It’s useful to look at what this could mean and then dissect it into SDN/NFV terms.

Right now, we increasingly build network services for users from shared connectivity.  We build IP networks and partition them at the IP level, so they’re not really a customer’s own network at all.  The devices that create the services are shared, in VPN form for example.  If we really wanted to give every customer their own network, we’d partition below the connectivity services, which means we’d be building the “connectivity tier” or access network without protocol specificity.  It would look like agile virtual wires.  Those wires could then be connected to virtual switches/routers to create services, and the result would be that every customer would have, in effect at least, their own network.
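
As a rough sketch of that idea (all names and fields here are hypothetical), a per-customer network could be described as a set of protocol-agnostic virtual wires terminated on the customer’s own virtual switch/router instances:

```python
# A minimal sketch (hypothetical names) of the "every customer has their own
# network" model: protocol-agnostic virtual wires below the connectivity layer,
# terminated on per-customer virtual switch/router instances at the edge.
from dataclasses import dataclass, field
from typing import List

@dataclass
class VirtualWire:            # an agile, protocol-agnostic "physical layer" path
    wire_id: str
    endpoint_a: str           # access termination, e.g. "site-nyc-port3"
    endpoint_b: str           # e.g. "vrouter-cust42-edge1"
    bandwidth_mbps: int

@dataclass
class CustomerNetwork:        # the per-customer overlay built from wires + vRouters
    customer: str
    virtual_routers: List[str] = field(default_factory=list)
    wires: List[VirtualWire] = field(default_factory=list)

net = CustomerNetwork(
    customer="cust42",
    virtual_routers=["vrouter-cust42-edge1", "vrouter-cust42-edge2"],
    wires=[VirtualWire("w1", "site-nyc-port3", "vrouter-cust42-edge1", 100),
           VirtualWire("w2", "site-sfo-port1", "vrouter-cust42-edge2", 100)],
)
print(net.customer, len(net.wires), "wires")
```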

This raises an interesting question in reference to the broad carrier model.  Should access networks, which have been incorporating more and more IP-specific features, be instead moving toward being virtual-wire or physical-layer?  This could be created using a combination of SDN and NFV, with the former providing virtual tunnels that became our physical layer connectivity service, and the latter providing the Level 2/3 overlay connectivity map.

I wonder if operators might have been better off thinking of SDN and NFV in these terms rather than in terms of things like virtual CPE or virtual EPC.  It’s not that either of these are bad applications, but that perhaps they’d make more sense in the long run if they were put into a connectivity mission.  vCPE, for example, is a nice place to host router/switching functionality if you don’t have a traffic topology complicated enough to justify a set of internal nodes (virtual, of course).

Full connectivity is usually generated by these internal nodes, where traffic routing assures there’s a path to every destination from every source.  The alternative, which is meshing the sites with physical paths, is clearly too costly.  But what about meshing with virtual paths?  If I can multiplex traffic at the virtual-path level and insure efficient traffic-handling, could I not create full-mesh networks based on edge switching/routing alone?  This is just one example of how new concepts demand we rethink the old restrictions and paradigms.

We don’t hear much about this sort of thing, no doubt in part because it doesn’t exactly fit in with the current network incumbents’ revenue/product plans.  Software-defined networking, which could use explicit forwarding control to create truly isolated virtual paths/wires, might have been an unfortunate term if we wanted to promote all the aspects of the new technology.  If “software definition” is the goal, you could accomplish it by providing various mechanisms for path control over legacy protocols using legacy devices.

If we can use SDN and NFV in the connectivity and service layers, what about that top layer, where applications and functions and content live?  It’s already clear that networks that deliver experiences will use the lower layers as services, and in fact one of the biggest promoters of “SDN” is cloud data center technology where virtual switching is more agile in tenant control than physical devices would be.  The big question IMHO is whether concepts of SDN and NFV work their way more intimately into the cloud, creating a much broader mission for themselves and linking themselves more with new revenue.

Content delivery is really a great cloud application if you have cloud infrastructure placed in enough suitable caching locations.  If you combine cloud technology with NFV for deploying instances of cache or distribution points and SDN for creating ad hoc connections, you can not only devise a more efficient CDN, you can blend it with mobile infrastructure.  And remember that mobile networks were what Verizon’s CEO was really focusing on.

IMS means “IP Multimedia Subsystem”.  The concept has been around for almost two decades, and IMS and related technologies (like Evolved Packet Core or EPC) have been the target of a lot of virtualization projects, including many in the NFV space.  But what are we really doing with these?  The answer is that we’re replicating the function and structure of legacy mobile networks but using hosted rather than fixed-box components.  Is that something like building SDN by overlaying it on legacy devices?  I think so.

The biggest challenge that Verizon faces with its three-tier plan—the biggest challenge mobile operators face overall—is how to define mobile services and infrastructure without accepting all the limitations of the physical devices, limited-capacity paths, and rigid OSI layering that we inherited.  The IP Multimedia Subsystem shouldn’t be explicitly IP at all.  If it is, then we’re eliminating most of the benefits that it could deliver to us…and to the operators themselves.

Looking Inside the HPE-Telefonica Integration Flap

If there is a network operator who epitomizes the understanding of the real requirements for the next-generation carrier network, it is Telefonica.  If there is a vendor who has the product elements needed to make a complete business case for NFV, it’s HPE.  Yet we heard on Christmas Eve that Telefonica was terminating HPE’s integration agreement.  If Telefonica doesn’t explain further, we’ll never know for sure what’s happened here, so the best course is to explore the situation from what’s known, and see if some light dawns.

Back in the spring of 2013, when the NFV ISG had only gotten started and there were no specifications to read, Telefonica was already looking ahead to how NFV could transform its business, how it might bring cloud principles to network services.  I was asked by a big US operator in May 2013 who the operator leader worldwide in innovation might be, and I answered “Telefonica” without hesitation.

Since then, Telefonica has built an understanding of NFV from the top down, based on explicit business goals and the best understanding of what I’ll call “ecosystemic” NFV business principles that any operator possesses.  They also worked to drive a vision for open, independent, NFV in the NFV ISG.  They were the first operator to speak up in favor of intent modeling as a principle of open NFV architecture, for example.  Their UNICA network deal was the real deal in every sense of the word, the best example of a studied plan to redo networking as we know it.

HPE’s OpenNFV has led the field in terms of functional completeness since it came out in 2013.  The critical requirements for operational integration, support for legacy infrastructure, and strong service modeling were there from the first.  HPE has built a large, active, and committed partner program that includes most of the key VNF providers, and because HPE is a premier cloud/server player, they have the potential to offer a complete NFV solution.  They’re also one of the six vendors who can deliver on the NFV business case.

The decision by Telefonica to make HPE their NFV integrator, made in early 2015, seemed a match made in heaven.  Since 2013 operators have been telling me that they’d prefer their primary NFV partners be server/IT companies.  To quote one operator, “We’d like to work with somebody who will make money if NFV deploys, not lose it.”  HPE would obviously fit that bill, but that’s gone awry now.

“Integration” means putting stuff together.  There are a lot of ways it could be done in theory, and so integration relies heavily on an architecture to define component roles and interfaces.  If we had an architecture, we could expect vendors to fit their pieces into the appropriate slots, and minimal work should suffice to make the connections.  If we don’t have one, then every piece of technology that has to fit into NFV has to be fitted, explicitly, by somebody.  The more pieces, the more work.

If there is any operator on this earth who is dedicated to making a business case for NFV as a whole, Telefonica is that operator.  Thus, their UNICA plans are unusually sensitive to the issues that hamper NFV’s evolution to a real, complete, deployment.  Their integration needs are profound, and sensitive to the state of specifications.

The PoCs and trials that we’re undertaking, and have undertaken, are simply silo solutions in search of a unified model of deployment and management.  That model should have come from the ETSI NFV ISG, but it did not, and it won’t come from that source until we’re well into 2017 or even later.  OPNFV won’t produce it either; Telefonica hasn’t joined it perhaps for that reason.  Without a model for integration, an operator whose plans depend on ecosystemic NFV is going to run into issues.  Even vendors who have a full solution (the six I’ve named many times) still have to contend with the fact that their own solutions aren’t fully interoperable.  As Telefonica’s UNICA grows and involves more companies, it is exposed to more viewpoints that need reconciliation, and that’s going to be true for every operator who proposes NFV deployment.  If NFV is expected to transform operator infrastructure, it has to be ubiquitous, unified, and efficient across all services.

Enter now the Telefonica integration contract.  How do you produce an open approach without suitable references?  What provider of NFV solutions would happily conform to a competitor’s model?  When you have two vendors who want something different from NFV’s VIM, MANO, or VNFM components, how do you mediate the differences?  And if you can’t do that, don’t you end up not with “integration” but with a bunch of one-off strategies?

There are other rumors here that might shed some light.  What Light Reading reported was that 1) HPE’s deal with Telefonica was terminated, 2) that there would be a re-bid of the integration contract, and 3) that HPE could bid on it again.  That’s an unusual set of conditions.  It could suggest, as Light Reading has, that HPE wasn’t willing to work diligently to create an open framework for Telefonica.  I doubt this is the case, given HPE’s commitment to “open” NFV and the fact that such a move would have been credibility suicide.

What’s the answer, then?  All this could suggest that HPE didn’t know how to do the integrating; that it didn’t see that architecture.  It could also suggest that HPE and Telefonica didn’t reach a meeting of the minds on what Telefonica would do, and what steps the integrator would take.  The “I don’t know how” and “We can’t agree” options are the two most likely reasons for the Telefonica move, IMHO.

HPE, in my opinion, has the best overall NFV product set, but they have as a company relied more on supporting individual PoCs and trials than on promoting architectural unity.  If you look at their public presentations on their NFV progress you see slides that ballyhoo their success in PoCs, which, given their diversity, is a success in silo-building.

I believe that HPE could unify those silos, almost certainly better than others could, but they don’t seem to promote that capability even in forums like the TMF where an operational unity theme would clearly be appropriate.  I’m not saying that HPE failed to show their best side to Telefonica (nobody but HPE or Telefonica could say that), but they’re not showing it to the market overall.  That could mean they don’t see it themselves and simply were not up to the task before them.

Returning to the buyer side, Telefonica needs a complete NFV ecosystem.  The logical pathway to that is open standards and sample code, and the open and standards-generating activities have not supplied that.  If there is no global standard to integrate against, Telefonica would have to supply (and enforce) one.  The question is whether the integration contract recognized what was involved.  It means a lot of top-down models and relationships, and also getting others to go along with what you’ve defined.  Telefonica has to do the heavy lifting on that part, because no integrator would have the influence needed to make all the NFV clocks chime at the same time absent suitable specs.  Perhaps they didn’t recognize these issues, and perhaps that’s why there has to be a re-bid of the deal.

Anyone who tries to integrate NFV, to build a glorious whole from a disconnected and frankly not terribly inspiring set of service-specific business cases, has a challenge simply unifying their story.  Add to that the need to unify it in an open sense, wrestling cooperation from vendors who, operators will admit, don’t even cooperate in the formal standards processes now underway, and the challenge only gets bigger.

Not an insurmountable one, though.  The introduction of a few simple concepts, like top-down thinking, intent modeling of service hierarchies, state/event-based coordination of all management processes, and intent-modeled VIMs, could save everything, unify everything.  Telefonica could achieve all this, simply by paying somebody to do the right thing and telling everyone else to play nice or play elsewhere.  HPE and other integrator candidates could be the unifier, or one of the players who take their ball and go home.

If the Light Reading reports are accurate, the fact that there’s to be a re-bid and that HPE will be invited to participate suggests that there may be at least a contributory issue with the RFP.  I think that issue is most likely related to the difficulty in integrating NFV components without a suitable model to refer to, and to use to enforce standards on all the players.  You can always custom-integrate something, but I don’t think Telefonica wants every VNF, every piece of NFVI, to launch a project to integrate it into infrastructure, whoever runs such a thing.  That’s not openness, and that proves that integration isn’t a path to an open approach, it’s an admission that you don’t have a reference to work against.

I just presented an open model for top-down NFV in four blogs, recapitulating two projects (CloudNFV and ExperiaSphere) that were both presented to NFV operators.  There are over 150 slides on that model available on the ExperiaSphere link, all developed from a top-down exploration of the benefits NFV needs.  There is no requirement for anyone to pay anything to use it or even acknowledge where they got it (if they use the term “ExperiaSphere” or the documentation they have to respect trademark and copyright, but otherwise, no restrictions).  Anyone at a loss for how to do NFV integration the right way is invited to review and use it as a whole or in parts, or invent something that does the same job and is as comprehensively documented.  “Anyone” here means Telefonica or HPE, or any of the other prospective bidders in the new integration deal.

Here’s the final truth.  Whether Telefonica had unrealistic expectations or HPE didn’t see (or didn’t want to do) what was needed, this is a vendor problem by definition.  No buyer has to convince sellers to sell to it; sellers have to convince buyers to buy.  If you field an NFV product you are responsible for making the business case for its adoption.  If that demands open architectures and integration, then you either have to provide for that or you can’t expect success.  The NFV vendor community, ultimately, has to get its house in order before next year, when business drivers and open approaches are going to be paramount.  UNICA is the tip of the iceberg.

Summary of an Open NFV Model…From the Top

In my recent group of blogs I’ve covered the main issues with openness in NFV.  What I’d like to do in closing this series is describe what could be an effective and open NFV model.  There are probably a lot of ways to skin the NFV cat, but this one has been proven out in a couple of projects and so I’m pretty confident it would work.  If you have other approaches in mind, I’d suggest you test them against the behavior of the model I’m suggesting, and to make that easy I’ll define some basic principles to build on.

The first principle is top-down design.  My approach to NFV has always been top-down, because I’m a software architect by background and that’s how you’d normally develop software.  The mandate is to start by defining the benefits you need, then define an architecture that addresses those benefits effectively, and finally an implementation that meets the requirements of the architecture.

NFV needs to cost less, make more, or both.  Even a simple exploration of those goals makes it clear that the primary requirement is service automation, the second principle.  NFV has to allow services to be built and run with zero touch (which is what the TMF “ZOOM” acronym means, by the way).

Service automation means turning a service description into a running service.  In modern thinking, a service description would be called an intent model, because it defines a “black box” of components by what they’re intended to do not by the contents of the box.  NFV, then, should have two high-level pieces—a service modeling framework and a software component that turns the model into a running service.  That’s the third principle.

If you look at the notion of intent models for services you see a natural hierarchy.  There is a high-level model that could be called RetailService.  The inside of this model would be another set of models to define the functions that make up the RetailService.  For example, a VPN service would have a single “VPNService” model and a “ServiceAccess” model for each access point.  This means that a service is a nested set of intent models that describe all the functions and how they relate to each other.
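
Here’s a minimal sketch of that nesting, with hypothetical model and SLA names:

```python
# A minimal sketch (hypothetical names) of a service as a nested set of intent
# models: a RetailService containing a VPNService and one ServiceAccess per site.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class IntentModel:
    name: str                      # what the model is intended to do, not how
    sla: Dict[str, float] = field(default_factory=dict)
    children: List["IntentModel"] = field(default_factory=list)

retail_service = IntentModel(
    name="RetailService",
    sla={"availability": 0.999},
    children=[
        IntentModel("VPNService", sla={"latency_ms": 40.0}),
        IntentModel("ServiceAccess:siteA", sla={"access_mbps": 50.0}),
        IntentModel("ServiceAccess:siteB", sla={"access_mbps": 50.0}),
    ],
)
```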

At some point, you decompose an intent model down to the level where what’s inside are specific resources, not other models.  In NFV, this is the point where your intent model references a Virtual Infrastructure Manager (VIM), and that VIM (more generally, an IM or Infrastructure Manager since all the infrastructure isn’t virtualized) is responsible for deployment.

Each intent model is described by a functional capability (VPN, access, etc.), by port/trunk connections, and by a service-level agreement or operating parameter set.  The set of all these parameters, these SLAs, defines the state of the service overall.  My proposal has always been to consider this combined data as what I called a “MIB-Set”, and further that it be stored in a repository that isolates the real resource MIBs and allows quick formulation of arbitrary management views.  That’s the fourth principle.

Keeping that repository up-to-date is a system-wide function, starting with the polling of resource MIBs and VNF MIBs that provide raw status information.  This status information is, you may recall, processed within the VIM to create a derived MIB for the intent model(s) the VIM represents.  It follows, IMHO, that when you build an intent model of intent models, you would define the SLA/parameter variables for the new model by referencing variables in the subordinate ones.  Each higher-level model has a derived management state.  The derivations would have to be refreshed, either on demand when something changes below, or periodically.
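
A minimal sketch of that derivation, with hypothetical variable names and derivation rules, might look like this:

```python
# A minimal sketch (hypothetical names and derivation rules) of a MIB-Set
# repository: each intent model's derived SLA variables are computed bottom-up
# from its subordinates, keeping raw resource MIBs isolated from these views.
service = {
    "name": "RetailService",
    "sla": {},
    "children": [
        {"name": "VPNService", "children": [], "sla": {"availability": 0.9995}},
        {"name": "ServiceAccess:siteA", "children": [], "sla": {"availability": 0.999}},
    ],
}

mib_set = {}   # repository: model name -> derived SLA/parameter variables

def refresh(model):
    """Recompute derived state bottom-up; run on demand or periodically."""
    for child in model["children"]:
        refresh(child)
    derived = dict(model["sla"])
    if model["children"]:
        # Example derivation rule: parent availability is the product of the
        # children's availability figures reported in the MIB-Set.
        derived["availability"] = 1.0
        for child in model["children"]:
            derived["availability"] *= mib_set[child["name"]]["availability"]
    mib_set[model["name"]] = derived

refresh(service)
print(mib_set["RetailService"]["availability"])   # ~0.9985
```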

Now, we can see how management and operations could work.  That starts by saying that each intent model is a finite-state machine, meaning that it has an operating state that is determined by the events that it recognizes.  It maintains its own state, and it can generate (via its SLA/parameter changes) events upward or downward to its superior or subordinate models.  This synchronizes all of the models in a lifecycle sense.
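
Here’s a minimal sketch of that state/event behavior, with hypothetical states, events, and process names:

```python
# A minimal sketch (hypothetical states, events, and process names) of an intent
# model as a finite-state machine: a state/event table selects which generic
# process runs, so no dedicated, "why-aware" VNFM component is needed.
def deploy(model):        print(f"deploying {model}")
def scale_out(model):     print(f"scaling out {model}")
def notify_parent(model): print(f"raising event to parent of {model}")

STATE_EVENT_TABLE = {
    ("ordered",   "activate"):      ("deploying", deploy),
    ("deploying", "ready"):         ("active",    notify_parent),
    ("active",    "sla_violation"): ("scaling",   scale_out),
    ("scaling",   "ready"):         ("active",    notify_parent),
}

class IntentModelFSM:
    def __init__(self, name):
        self.name, self.state = name, "ordered"

    def handle(self, event):
        next_state, process = STATE_EVENT_TABLE[(self.state, event)]
        process(self.name)          # the process doesn't know (or care) why it runs
        self.state = next_state

vpn = IntentModelFSM("VPNService")
vpn.handle("activate")
vpn.handle("ready")
vpn.handle("sla_violation")   # triggers scale-out purely from table context
```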

Anything that has to “see” management data will see the MIB-Set it’s a part of, and nothing else.  In fact, it’s my view that everything has to see only the SLA/parameter variables of its subordinate models.  The job of each model is to sustain its own state based on the state of the stuff below.

In this approach, a management system can read the MIB-Set and see what’s happening for the service at any level where it has the privilege to access.  Furthermore, any NFV process, NMS process, or OSS/BSS process can be associated with a given event in any recognized operating state of any intent model, and it would be run when that combination occurred.

With this approach there is no such thing as a “VNFM” as a specific software element.  We have a set of state/event tables and processes associated with the linkage.  The tables determine what gets run, and the processes are generic, tied not to why they are being run (as a VNFM would be) but to what they do.  If you scale out a VNF, for example, it doesn’t matter why you’re doing it because that was covered in the state/event table.

Service automation in this model is simply a matter of defining the “things to be done” as a set of “microservices” (to use the modern term) and then associating them with a context through state/event table definitions.  Because what happens in response to events is the essence of operations management, we’re automating operations at any level.

This circles back to the top, the benefits.  It proves that while the functional notion of how NFV has to work can be extracted from the ISG’s work, it’s unwise to extract the implementation in an explicit way.  Software designers didn’t write the documents, but software is what has to emerge from an NFV implementation process.  It also proves that you can make NFV work, and make it realize all the benefits hoped for it.

As we move into 2016 we enter a critical period for NFV.  Without a high-level framework like the one I’ve described here, we can’t unify NFV trials and deployments into a new model of infrastructure.  We can’t prevent silos and loss of efficiency, or lock-ins.  We can’t really even test pieces of NFV because we have no useful standard to test against.  What difference does it make if a piston works great if it won’t fit in the car?

It would have been better for everyone had we developed NFV from the top down and addressed benefits realization the way it should be addressed in a software project.  That didn’t happen, but while the NFV ISG has wandered too far over into defining implementation and away from its mission of providing a functional spec, we can still extract the outlines of that functional spec from the work.  We can then apply top-down principles to build the implementation.  Some vendors (the six I’ve named in prior blogs) have done that, and I hope that more will follow in 2016.  I also hope that the focus of NFV shifts to the business case, because while I believe firmly that we can make a business case for NFV, I know that we’ve not done that yet.

Making NFV’s Models and DevOps Concepts Work

When cloud computing came along, it was clear that the process of deploying multi-component applications and connecting pieces to each other and to users was complicated.  Without some form of automation, in fact, it’s likely that the deployment would be so error-prone as to threaten the stability and business case for the cloud.  What developed out of this was an extension of an existing concept—DevOps—to do the heavy lifting.  NFV needs the same thing, but likely more so.  But how would NFV “DevOps” work?

There are two models of DevOps used in the cloud.  One, the “declarative model”, defines the end-state desired and lets the software figure out how to get there.  The other, the “imperative model” defines the steps to be taken.  There is a general consensus that NFV needs a declarative approach, but none have been officially accepted, and most people don’t realize that NFV really has two different “models” to contend with.
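
A rough sketch of the contrast, with hypothetical names, looks like this:

```python
# A minimal sketch (hypothetical names) contrasting the two DevOps models.
# Declarative: describe the end-state you intend; the tooling works out the steps.
declarative_model = {
    "service": "vCPE-firewall",
    "intent": {
        "instances": 2,
        "connect": ["access-port", "trunk-port"],
        "sla": {"availability": 0.999},
    },
}

# Imperative: spell out the steps yourself, which tends to bake one specific
# implementation of the function into the deployment description.
def imperative_deploy():
    steps = [
        "allocate two VMs on host pool A",
        "load firewall image v3.2",
        "attach vNIC to access-port",
        "attach vNIC to trunk-port",
        "start monitoring agent",
    ]
    for step in steps:
        print("executing:", step)

imperative_deploy()
```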

You’ll remember from a prior blog that the Virtual Infrastructure Manager (VIM) is (IMHO) responsible for converting an intent model of a deployment (or piece of one) into the steps needed to actually commit infrastructure.  That means that the VIM would likely employ an NFV-suitable DevOps model.  However, if services are (as I’ve asserted they must be) composed of multiple pieces that are independently deployed, then there has to be another model that describes service composition, as opposed to component deployment.

This second model level is absolutely critical to NFV success, because it’s this level of model that insures that the infrastructure and even VNF components of a service are truly open and interoperable.  Unlike the lower-level model inside the VIM, the service-level model contains logical elements of a service.  Unlike the lower-level model that’s resource-specific, the service-level model is function-specific.  Finally, the service-level model is what the NFV MANO function should work on, launching the whole of the NFV process.

My view of the service-level process is based on a pair of operations people or teams—the service and resource architects.  Service architects decide what features will be required for a service and lay out the way those features have to look, to customers and to each other.  Resource architects decide how features are deployed and connected in detail.  It’s easiest to visualize this in terms of service catalogs.

A retail service goes in the “finished goods” section of a catalog.  What it looks like is simple; it’s a model of the functional makeup of the service, connected through a set of policies to a set of resource models (minimum of one per service function) that can fulfill it.  When you order a service (directly through a portal or via a customer service rep) you extract the associated service model and send it to a deployment process (which some say is inside the OSS/BSS, some inside MANO, and some external to both).  That process uses the parameters of the service to first select implementation options, and second to drive the VIMs associated with the collective resources available to build each function.

This implies that a catalog also contains what could be called a “blueprint” section and a “parts for production” section.  The former would be the service models, both as low-level functions and as collections of functions.  The latter would be the resource models.  Service architects would build services by assembling the service model components and binding them to resource models.  The result would then go into the “finished goods” section, from which it could be ordered.
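
A minimal sketch of how the three catalog sections might relate (all names hypothetical):

```python
# A minimal sketch (hypothetical names) of the three catalog sections: resource
# models ("parts for production"), service models ("blueprints"), and the
# assembled, orderable retail services ("finished goods").
catalog = {
    "parts_for_production": {          # resource models, one or more per function
        "vpn-mpls":   {"vim": "vim-core",  "realizes": "VPNFunction"},
        "vpn-hosted": {"vim": "vim-cloud", "realizes": "VPNFunction"},
        "access-ce":  {"vim": "vim-edge",  "realizes": "AccessFunction"},
    },
    "blueprints": {                     # service models: functions and bindings
        "BusinessVPN": {
            "functions": ["VPNFunction", "AccessFunction"],
            "bindings": {"VPNFunction": ["vpn-mpls", "vpn-hosted"],
                         "AccessFunction": ["access-ce"]},
            "policies": {"prefer": "lowest-cost"},
        },
    },
    "finished_goods": ["BusinessVPN"],  # what a portal or CSR can actually order
}

def order(service_name):
    blueprint = catalog["blueprints"][service_name]
    # A selection policy picks an implementation, then the matching VIM is driven.
    for function in blueprint["functions"]:
        choice = blueprint["bindings"][function][0]
        vim = catalog["parts_for_production"][choice]["vim"]
        print(f"{function}: deploy {choice} via {vim}")

order("BusinessVPN")
```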

The VIM-level DevOps process would clearly look like a simple cloud deployment, and so it would be possible to use either declarative or imperative modeling for it.  I think that both will likely end up being used, and that’s fine as long as we remember a rule from the past blog:  You cannot expose deployment details in the VIM interface to MANO or you lose the free substitution of implementations of functions.  Thus, you can have an intent model that’s fulfilled under the covers by a script, but the VIM itself has to present a model (declarative) as the interface.

That means to me that the service model has to be declarative too.  Not only that, it has to be declarative in a function sense, not in the sense of describing how the deployment happens below.  It’s fine to say, in a service model, that “three access functions connect to a common VPN function”, but how any of these functions are implemented must be opaque here.  If it’s not, you don’t have a general, open, NFV service.
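
As a sketch of what that functional declaration might look like (names hypothetical):

```python
# A minimal sketch (hypothetical names) of a declarative, purely functional
# service model: three access functions connect to a common VPN function, and
# nothing here says how any function is implemented, so implementations can swap.
service_model = {
    "service": "BusinessVPN",
    "functions": {
        "vpn":      {"type": "VPNFunction"},
        "access-1": {"type": "AccessFunction", "connects_to": "vpn"},
        "access-2": {"type": "AccessFunction", "connects_to": "vpn"},
        "access-3": {"type": "AccessFunction", "connects_to": "vpn"},
    },
}

connections = [(name, spec["connects_to"])
               for name, spec in service_model["functions"].items()
               if "connects_to" in spec]
print(connections)   # [('access-1', 'vpn'), ('access-2', 'vpn'), ('access-3', 'vpn')]
```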

That may have an impact on the modeling language selected.  My own work on NFV was based on models defined using XML, but I also think that TOSCA would be an excellent modeling option.  I’m far less enthusiastic about YANG because I think that modeling approach is more suitable for declaring deployment structures, meaning it would serve as a declarative model within the VIM.  It could probably be made to work, providing that the mission of service modeling wasn’t compromised.

The boundary between service and resource modeling isn’t rigid.  The bottom of resource modeling is the intent models representing the VIMs, but it’s possible that modeling “above” that could be used to set resource-related policies and select VIMs to keep those kinds of decisions out of the service models.  In my ExperiaSphere project, I proposed a “service” and “resource” domain, but a common modeling approach in both.  I still think that’s the best approach, but it’s also possible to extend VIM modeling upward to the service/resource boundary.  I think the benefits of extending VIM modeling upward like that are limited because of the need to support all the possible VIMs representing all the possible flavors of infrastructure.  You couldn’t predict what VIM modeling would be used in the collection of VIMs you needed.

I think the service/resource domain concept is a natural one for both technical and operational reasons, and it might also form the natural boundary between OSS/BSS processes and resource/network management and “NFV” processes.  If I were an OSS/BSS player or a limited-scope provider of NFV solutions (an NFVI player, for example), I’d focus my attention on this boundary and on standardizing the intent model there.  If I were a full-service player, I’d tout my ability to model in a common way on both sides of the boundary as a differentiator.  You could then integrate virtually any real NFV implementation or silo with operations systems of any flavor.

Service/resource modeling is also likely essential for effective management of NFV-based services, or in particular of services with only a few NFV elements.  The “service” layer can represent functional things that users expect to see and manage, and the service/resource connection is where the craft view of management intersects with the customer/CSR view.

Declarative modeling is the right approach for NFV, I think.  Not only does it map nicely to the intent model concept that’s critical in preserving abstractions that form the basis for virtualization, it’s naturally implementation-neutral.  It is very difficult to write a script (an imperative DevOps tool) that isn’t dependent on the implementation.  A good declarative modeling strategy at the service and resource level is where NFV should start, and where it’s going to have to end up if it’s ever to be fully realized.

That’s because NFV benefits in operations efficiency and service agility are critical, and these benefits depend on management.  Management is a whole new can of worms, of course, but it can be fixed if we presume a proper modeling of services and resources.  That will be my next topic in this series.

What an NFV Resource Pool Has to Look Like

Network functions virtualization (NFV) is supposed to be about running network service features on commodity servers using “virtualization”.  While the vCPE edge-hosting movement has demonstrated that there’s significant value in running virtual functions in other ways, virtualization and the cloud is still the “official” focus of the ETSI work, and what most think will be the primary hosting mechanism in the long term.  The infrastructure that makes up the resource pool for NFV is critically important because most NFV dollars will be spent there, so here in the second of my series on NFV openness, we’ll look at those resources.

In the ETSI E2E specification, the pool of resources that are used to host and connect virtual functions is called Network Functions Virtualization Infrastructure, or NFVI.  Services, described in some way to the Management and Orchestration (MANO) piece of NFV, are supposed to be committed to NFVI through the mechanism of one or more Virtual Infrastructure Managers (VIMs).  If we looked at NFV from a software-development perspective (which, since it’s software, we should), the VIM is the abstraction that represents resources of any sort.  That means that NFV management and orchestration really isn’t dependent on NFVI directly; it shouldn’t even know about it.  It should know only VIMs.
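
A minimal sketch of that relationship (hypothetical class and method names): MANO holds references only to VIM abstractions, never to the NFVI behind them.

```python
# A minimal sketch (hypothetical names): MANO sees only VIM abstractions; each
# VIM hides whatever NFVI it fronts, so infrastructure can vary freely.
from abc import ABC, abstractmethod

class VIM(ABC):
    """The abstraction MANO works against; NFVI details live behind it."""
    @abstractmethod
    def deploy(self, intent_model: dict) -> str: ...
    @abstractmethod
    def status(self, deployment_id: str) -> dict: ...

class CloudVIM(VIM):                  # fronts a cloud-stack-based NFVI
    def deploy(self, intent_model):   return "cloud-deploy-001"
    def status(self, deployment_id):  return {"state": "active"}

class EdgeCpeVIM(VIM):                # fronts edge-hosted (vCPE) devices
    def deploy(self, intent_model):   return "edge-deploy-001"
    def status(self, deployment_id):  return {"state": "active"}

def mano_deploy(vim: VIM, intent_model: dict):
    # MANO doesn't know or care which NFVI is underneath the VIM it's given.
    dep_id = vim.deploy(intent_model)
    return vim.status(dep_id)

print(mano_deploy(CloudVIM(),   {"function": "VPN"}))
print(mano_deploy(EdgeCpeVIM(), {"function": "vCPE"}))
```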

In the software world, NFV specifications would have made clear that anything is fine for NFVI as long as there’s a VIM that represents it.  We shouldn’t be worried about “NFVI compatibility” except in VIM terms because it’s the VIM that has to create it.  And that raises the question of what the VIM says things should be compatible with.

Deployment of virtual functions on servers via VMs or containers obviously would look a lot like what a cloud management software stack like OpenStack would do.  In fact, one could argue that a baseline reference for a VIM would be OpenStack, and in particular the hosting (Nova) and networking (Neutron) elements of it.  OpenStack assumes that you’re going to deploy a network-connected (cloud) service by defining the network elements and placing hosting points in them.  VIMs that did nothing but deploy and connect virtual functions could be little more than OpenStack or another cloud stack.

This limited view presents some problems because a “service” is almost certainly more than just a connected set of virtual functions.  There are legacy elements, things like routers and switches and DNS and DHCP—a bunch of things that are needed to run software that’s designed to offer network-related features.  Thus, a VIM should be a superset of OpenStack or other cloud stacks, doing what they do but also handling the legacy pieces.

A VIM should also present a very specific and implementation-independent abstraction of the things it can do—back to what’s now popularly called an “intent model”.  Whether I want to deploy a service chain or an IP VPN, I need to describe the goal and not the implementation.  That means that VIMs would be responsible for creating a set of specific abstractions—intent models—on whatever infrastructure they represent.  If abstractions like “vCPE” or “VPN” are defined by a VIM, then they have to be defined in the same way no matter how the VIM realizes them.  If that’s true then any NFVI can be freely substituted for another, providing the VIMs for each recognize the abstractions you plan to reference.

This is how edge-hosted VNFs should be supported; as a specialized VIM that makes deployment on a customer-sited device look like deployment as part of a cloud.  It’s also how dedicated servers, diverse cloud stack architectures, containers, and so forth should be supported.  It’s a totally open model, or at least it can be.

What threatens this nice picture is any of the following:

  1. The notion that there can be only one VIM in an implementation. If that’s the case, then every vendor would have to be supported under that VIM, and since the VIM would likely be provided by a vendor that would be unlikely to be true.  Only by recognizing multiple VIMs “under” MANO can you have open infrastructure.
  2. Any implementation-specific reference made in the abstraction(s) that describe the VIM’s capabilities. If, for example, you require that an abstraction incorporate a specific policy structure or CLI parameters, you’ve built a bridge between the model of the feature (the VIM abstraction you’re using) and a specific implementation.  That forecloses the use of that description with other VIMs.
  3. An inconsistent way of specifying the abstractions or intent models. If one vendor says that the VIM abstraction is “VPN” and another says it’s the specific device structure that makes up a VPN, then the two models can’t be used interchangeably.

All of this is important, even critical, but even this isn’t sufficient to insure that VIMs and NFVI are really open.  The other piece of the puzzle is the management side.

Even if you can deploy a “vCPE” or “VPN” using a common single model and a set of compatible VIMs, you still have the question of management.  Deployment choices have to be collected in a common model or abstraction, and so does management.  Any VIM that supports an abstract model (vCPE, etc.) has to also support an abstract management model in parallel, and whatever management and operations processes are used in NFV or in OSS/BSS/NMS have to operate on service resources through this abstraction only.

All abstract NFV features either have to be managed the same way, based on the same management abstractions, or management processes have to be tuned to the specific deployment model.  This dilemma is why the ISG ended up with the notion of VNF-specific VNF Managers.  But if we solve the problem of differences in management by deploying customized VNF managers in with the VNFs, we still have to address the question of where they get their data on resource state.  We also have to deal with how they do things like scaling in and out.  These should be abstract features of the VIM models, not scripts or hooks that link a VNF Manager directly to the internals of a VIM or NFVI.  The latter would again create non-portable models and brittle implementations.

I think the message here is simple.  Infrastructure is always represented by an intent model or models, implemented through a VIM.  The model has to define the function (“VPN”), the interfaces/ports, and the management properties, which would in effect be a MIB.  The VIM has to transliterate between these MIB variables for an instantiated model, and the resource MIBs that participate in the instantiation.  If you have this, then you can deploy and manage intent models through a VIM, and if two VIMs support the same intent model they would generate equivalent functionality and manageability.  That’s what an open model has to work like.
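
A minimal sketch of that transliteration, with hypothetical variable names:

```python
# A minimal sketch (hypothetical variable names) of the management side of an
# intent model: the VIM transliterates raw resource MIB variables into the
# abstract MIB the model exposes, so any conforming VIM is manageable the same way.
def resource_mibs():
    # In reality this would be polled from the devices/VMs behind the VIM.
    return [
        {"node": "vrouter-1", "if_oper_status": 1, "latency_ms": 12.0},
        {"node": "vrouter-2", "if_oper_status": 1, "latency_ms": 18.0},
    ]

def vpn_intent_mib():
    """The 'VPN' intent model's MIB: function, ports, and derived SLA variables."""
    raw = resource_mibs()
    return {
        "function": "VPN",
        "ports": ["access-1", "access-2"],
        "operational": all(m["if_oper_status"] == 1 for m in raw),
        "latency_ms": max(m["latency_ms"] for m in raw),   # worst-case view
    }

print(vpn_intent_mib())
```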

What Do We Need to Make VNFs Open and Interoperable?

In prior blogs, I’ve talked about the vCPE mission for NFV—why it’s gaining in importance and what models of vCPE might be favored.  Some people have asked (reasonably) whether there were any technical issues on NFV implementation raised by the vCPE model.  There are; some that are more exposed by vCPE than created by it, and some that are created by it.

vCPE is challenging for two primary reasons.  First, the vCPE model almost demands highly portable VNFs.  You have to be able to deploy vCPE on various platforms because there are probably multiple (and fairly different) edge-hosting alternatives developing, and because the cloud has to be one of the hosting options too.  Second, vCPE isn’t an NFV service, but rather it’s an NFV piece of a broader service (like Carrier Ethernet) that already exists and will likely be deployed based on legacy technology for years to come.

Some of the developments I cited in my last blog (the Wind River and Overture announcements) are driven by the first point.  Everyone who’s tried to run a Linux application in the cloud knows that operating system and middleware version control can be critical.  OpenStack has to run on something, after all, and so do the VNFs that become either machine images or container images.  However, while platform uniformity is a necessary condition for VNF portability, it’s not a sufficient condition.

Programs rely on APIs to connect to the outside world.  Chained VNFs in vCPE are no different.  They have to connect to other things in the chain, they have to connect to the real access and network ports of the service they’re integrated with, and they have to connect to the NFV management and operations processes.  If there are “private” VNF Managers (VNFMs) involved, then there may also have to be links to the resource/infrastructure management processes.  In short, there could be a lot of APIs, like a puzzle piece with a lot of bays and peninsulas to fit other things with.

All of these are written into the virtual function software, meaning that the program expects a specific type of API that’s connected in a specific way.  vCPE, because of the multiplicity of things it might run on and the multiplicity of ways it might be composed into services, probably generates more variability in terms of these bays and peninsulas than any other VNF mission.  For service chaining and vCPE VNFs to be open and portable, we’d have to standardize all these APIs, or every different framework for vCPE deployment would demand a different version of VNF software.

One of these bay/peninsula combinations represents the port-side (user) and trunk-side (network) connections for the service.  Obviously the user has to be a part of a service address space that allows a given site/user to communicate with others that share the service (VPN or VLAN, for example).  These are public addresses, visible to the service users.  But do we want that visibility for the interior chaining connections, for the management pathways, for the resource connections?  Such visibility would pose a serious risk to service stability and security.  So we have to define address spaces, and keep stuff separate, to make vCPE work.

None of these issues are unique to vCPE, but vCPE surely has to have them resolved if it’s to succeed on a large scale.  There are other issues that are largely unique to vCPE, at least in one specific form, and some of these are also potentially troubling.

One example is the question of the reliability of a service chain through an insertion or deletion.  We talk about a benefit of vCPE as being the ability to support no-truck-roll in-service additions of features like firewall.  The challenge is that these features would have to be inserted into the data path.  How does one reroute a data path without impacting the data?  And it’s not just a matter of switching a new function into a chain—the things both in front of and behind that point have to be connected to the new thing.  A tunnel that once terminated in a user demarcation might now have to terminate in a firewall function.  And if you take it out, how do you know whether there’s data in-flight along the path you’ve just broken down?

Another example is on the management side.  We have a dozen or more different implementations of most of the vCPE features.  Does the user of the service have to change their management practices if we connect in a different implementation?  Sometimes even different versions of the same virtual function could present different features, and thus require a different management interface.  If we’re updating functions without truck rolls, does the user even know this happened?  How would they then adapt their management tools and practices?

Staying with management, most CPE is customer-managed.  A virtual function in the middle of a service chain has to be manageable by the user to the extent that the function would normally be so managed.  If I can tune my firewall when it’s a real device, I have to be able to do the same thing if I want when I have a virtual firewall.  But can I offer that without admitting the user to the management space where all manner of other things (mostly damaging) might be done?  Do I have to have both “public management” and “private management” ports to address the dualistic notion of CPE management?

These problems aren’t insurmountable, or even necessarily difficult, but they don’t solve themselves.  The fact that we can tune VNFs to work in a given environment is nice, but if we have to tune every VNF/environment combination it’s hard to see how we’re going to gain much agility or efficiency.

How have we gotten this far without addressing even these basic points?  Most of the work done with vCPE has either been focused on a specific partnership between hosting and VNF vendors (where the problem of multiple implementations is moot) or it’s been focused on simply deploying via OpenStack, which tends to expose management and operations processes at the infrastructure level to VNFs and VNFMs.  That breaks any real hope of interoperability or portability.

You can’t have an open, portable, model of VNFs if every VNF decides how it’s going to be deployed and managed.  At the least, this approach would demand that a service architect or customer would have to understand the specific needs of a VNF and adapt scripts or diddle with management interfaces just to connect things in a service chain.  A service catalog would be either dependent on a single vendor ecosystem where everything was selected because it made a compatible connection choice, or simply a list of things that couldn’t hope to drive automated assembly or management.

I suggested in a prior blog that we establish a fixed set of APIs that VNFs could expect to have available, creating a VNF platform-as-a-service programming environment.  The key requirement for VNFPaaS is abstraction of the features represented by each interface/API.  It’s not enough to say that a VNF or VNF Manager can access a MIB.  If the VNFM has to be parameterized or modified to work with a given MIB, then we’ve created a brittle connection that any significant variation in either element will break, and we’re done with any notion of assembling services or deploying new features on demand.
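
A minimal sketch of what such a VNFPaaS contract might look like from the VNF’s point of view (all API names here are hypothetical):

```python
# A minimal sketch (hypothetical API names) of a VNF platform-as-a-service
# contract: a fixed set of abstract calls every VNF can rely on, regardless of
# where it's hosted or how the underlying infrastructure is managed.
from abc import ABC, abstractmethod

class VNFPaaS(ABC):
    @abstractmethod
    def get_port(self, role: str):
        """Return a connection handle for an abstract port role
        (e.g. "service-port", "trunk", "chain-next")."""
    @abstractmethod
    def read_state(self, variable: str):
        """Read an abstract management variable without exposing resource MIBs."""
    @abstractmethod
    def raise_event(self, event: str, detail: dict):
        """Report a lifecycle event (e.g. overload) to NFV management."""

class FirewallVNF:
    def __init__(self, platform: VNFPaaS):
        self.platform = platform              # the only thing the VNF binds to

    def start(self):
        inside = self.platform.get_port("service-port")
        outside = self.platform.get_port("trunk")
        self.platform.raise_event("ready", {"ports": [str(inside), str(outside)]})
```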

The current PoCs and trials have identified where these VNFPaaS features are needed, and with a little thinking they’d provide all the insight we need to define them and start them moving toward specification and standardization.  This is an achievable technical goal for the NFV ISG and also for OPNFV, and one both bodies should take up explicitly.  A related goal, which I’ll cover in a blog next week, is to define a similar abstraction for the Virtual Infrastructure Manager’s northbound interface.  If those two things were nailed down we could make major progress in advancing the cause of an open, optimal, model for NFV.

What’s the Best Path to Telco Open-Source Success?

Somebody asked me yesterday what I thought was the pathway to open-source success in the telco space.  I got involved in open-source and telco applications thereof back in about 2008, working within the TMF and the IPsphere Forum.  In both these bodies I was for a time the only non-telco representative in the group, and sadly neither of these efforts came to much.  Does participating in a failure qualify you as an expert on something, you might wonder.  Perhaps not, but being involved with two of them gives you an insight into why something like this isn’t easy.  You have to start at the beginning, with lawyers.

Telcos worldwide are highly regulated, and one aspect of regulation that is particularly plaguing to them is the anti-trust stuff.  It’s hard for a highly interconnected industry like telecommunications to progress if every operator goes their own way; you need to meet in the middle of every interconnect whether you’re interconnecting devices or software.  You’d think that a bunch of telcos could just form a body and hack out the requirements and specifications for doing something using open-source software.  Wrong.  That’s collusion, as their lawyers will tell them, and those same lawyers would pull them bodily out of any such activity.

Standards bodies or industry groups are exempt from this collusion risk providing their membership is open to all, so they don’t become a forum where telcos could engage in rampant anti-competitive activity.  The NFV process, launched by telcos, was surrendered to an open body (ETSI) because of this.  The problem is that open membership means that the bodies tend to become stacked with vendors, and control then passes to those vendors.  Many in the SDN and NFV worlds would admit in private that this has happened to both these technologies.

The reason I’m going through all of this is that it’s the background for the current telco open-source interest.  The telcos cannot build something on their own (many are also barred from manufacturing, in fact).  They can’t form telco-communities to build something jointly, and their efforts to build something in open standards groups have been taken over by vendors.  What’s left?  Why not open-source software?  At the very least, telcos could hope to contribute code of their own to a project and thus keep their own interests represented.  Further, open source projects could put the vendors in the position of joining and cooperating, and by doing so devaluing their own proprietary strategies, or sulking away and leaving the task to the companies who wanted it in the first place.

The hope of an open-source solution to next-gen network technology and infrastructure has traditionally been dashed by a mixture of three factors.  First, telcos have had a hard time getting funding to do coding, which leaves them at the mercy of actual contributors—usually vendors.  Second, telcos have no strong history of software design and development, and often have no idea how to go about an open-source project.  Third, telcos really love having scapegoats.  If a commercial product fails, you can sue the vendor.  Who do you sue if an open-source product failed, particularly if you were one of the contributors?  Who stands behind and supports a non-commercial, community, effort?

Open source issues have been addressed and resolved before.  Apache, OpenStack, Red Hat, and many other software projects and open-source companies have worked out models for development and deployment that have been successful in many other industries.  That means that to the extent that there is an answer to the “best-path-to-telco-open-source” question, the answer is likely to be based either on mimicking the practices of those past successes or simply letting one of the successful firms do the job.  We could call this the “OpenStack versus Red Hat” dilemma, casting each as a representative of its respective model.

OPNFV is an “OpenStack” or community approach, and I think most telcos would agree that they’d love to see this approach win.  The problem that seems to be limiting the chances of that happy outcome is the fact that rather than attacking the problem of NFV from the top down, as a real software development organization would do to fulfill a generalized set of goals, they’ve elected to presume that the ETSI ISG specifications will set the software requirements.  Those were not developed from the top down.  Furthermore, the “community” really doesn’t have a single purpose.  Some already believe others are deliberately obstructing progress, and since NFV is a threat to some big incumbents they may be right.

Red Hat isn’t currently offering an NFV solution at all, but their approach to enterprise Linux is certainly a model of success.  You pick out what you believe will be a powerful, agile, and stable set of software components and you adopt them as your platform.  No community to wrangle with, no compromises to be made.  Doing telco software this way would work if the Red-Hat-or-imitator sponsor could be relied on to make the best choices for the telco buyers.  Since open-source companies like Red Hat have no horse in the SDN or NFV race, they could very likely be expected to do just that for commercial gain alone.

I’ve done some work on mapping open-source software to SDN, NFV, and networking problems in general.  Others like Overture Networks have also done that, and I think it’s clear that you could build everything you need for the management, orchestration, and operationalization of next-gen networks by assembling open-source elements.  So perhaps the path to an open-source telco future lies not in communities but in the hands of some open-source giant like Red Hat.

Consensus is great for a lot of things, but despite some myths of the ‘60s it doesn’t actually define reality.  What telco open-source needs is some picking and culling to identify the best components already available, then an earnest effort to support whatever development remains.  That’s true in the narrow area of SDN and NFV, but also in the broader area of OSS/BSS.  We have to get this process moving, which I think means recognizing that consensus processes have failed us many times already.  Successive approximation may now be the answer.  Build suites by collecting stuff and let the best approach win.

I don’t mean to suggest Red Hat is the only player who can do this, of course.  Many companies have strong open-source positions, and any of them could step up.  Until they do, I think the telco open-source initiatives are all heading down the path of past initiatives—toward an inconclusive outcome at best.

Some New Offerings in the vCPE Space: Will They Help?

If, as I’ve suggested in the last week or so, vCPE is most likely to drive NFV adoption in 2016, then what happens with it is pretty important.  It’s interesting, then, that we really aren’t saying much about vCPE models and issues.  As we head for a major year of trials and testing, I think we should open a discussion on the topic and review some significant developments.  In fact, I propose to use three developments as a foil to address the issues.

There are two basic models of vCPE deployment.  In one, the hosting of virtual functions takes place in an edge device that acts as a kind of local, customer-specific, NFV Infrastructure (NFVI).  In the other, hosting takes place on conventional servers in the cloud.  It appears, based on carrier comments, that the former model has the best chance of gaining broad acceptance in 2016.  The reason is that cloud-hosting requires a cloud, meaning a reasonably efficient and conveniently placed resource pool.  In early deployments it could be difficult to provide that without shoveling a bunch of servers out to major offices and hoping for customers to justify it.  With edge-hosted NFV, no pool is needed and costs are incurred only when you get orders.
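
As a toy illustration of that economic logic (the numbers and names below are hypothetical, not carrier data), the hosting decision defaults to the edge until a pool actually exists and its unit cost undercuts the CPE device:

```python
def choose_hosting(pool_available: bool, pool_unit_cost: float,
                   edge_device_cost: float) -> str:
    """Toy decision: edge-hosted vCPE needs no pre-built pool, so it wins by
    default until a pool exists whose unit cost undercuts the CPE device."""
    if not pool_available:
        return "edge"  # cost is incurred only when an order actually arrives
    return "cloud" if pool_unit_cost < edge_device_cost else "edge"


# Early deployment: no convenient, efficient pool yet, so orders go to the edge.
print(choose_hosting(pool_available=False, pool_unit_cost=0.0, edge_device_cost=400.0))   # edge
# Later: a well-utilized pool drops unit cost below the CPE price point.
print(choose_hosting(pool_available=True, pool_unit_cost=250.0, edge_device_cost=400.0))  # cloud
```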

Edge hosting also simplifies NFV management issues, because the primary resource for function hosting isn’t shared.  You don’t have to apply complex deployment policies to decide where to host something, and once it’s hosted then all the hardware involved in the function is in a single place, represented by a single management point.  The fact is that one of the best things about edge-hosted vCPE is that it really doesn’t have to look much like NFV at all…at first.  In the longer term, there are three challenges associated with edge-hosting vCPE.
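
A small sketch of the management consequence, using invented host identifiers: edge hosting maps every function a customer buys to a single management point, while pooled hosting scatters the same functions across shared servers that each need their own management binding and placement policy.

```python
from typing import Dict, List


def management_points(deployments: List[Dict[str, str]]) -> set:
    """Collect the distinct management points behind a set of hosted functions."""
    return {d["host"] for d in deployments}


# Edge-hosted vCPE: everything for this customer lives on one CPE agent.
edge = [{"function": f, "host": "cpe:site-acme-01"} for f in ("firewall", "nat", "wan-accel")]

# Cloud-hosted vCPE: the same functions land on whichever pool servers the
# placement policy picked, each a separate management point on shared hardware.
cloud = [
    {"function": "firewall",  "host": "server:pool-east-17"},
    {"function": "nat",       "host": "server:pool-east-04"},
    {"function": "wan-accel", "host": "server:pool-west-09"},
]

print(management_points(edge))   # one management point
print(management_points(cloud))  # three, plus shared-resource policies to worry about
```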

One is multiplicity of platforms.  It would be easy to develop a vast set of alternative edge devices to host on, each with their own hardware features, software platforms including middleware, and their own VNF-hosting requirements.  VNF vendors and operators might find so much edge diversity that just figuring out what version of a VNF to make or run would be a formidable task.
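
Here’s a small sketch of that matching problem, using hypothetical platform and VNF descriptors.  With N edge platforms and M VNFs, somebody has to build and track up to N x M variants, which is the multiplicity a common reference design is meant to collapse:

```python
from typing import Dict, Optional

# Hypothetical descriptors: each edge device advertises its platform traits,
# and each VNF package declares the traits it was built against.
EDGE_PLATFORMS = [
    {"id": "cpe-model-a", "arch": "x86_64", "hypervisor": "kvm",       "middleware": {"dpdk"}},
    {"id": "cpe-model-b", "arch": "arm64",  "hypervisor": "container", "middleware": set()},
]

VNF_BUILDS = [
    {"vnf": "firewall", "version": "1.2-x86-kvm", "arch": "x86_64", "hypervisor": "kvm",       "needs": {"dpdk"}},
    {"vnf": "firewall", "version": "1.2-arm-ctr", "arch": "arm64",  "hypervisor": "container", "needs": set()},
]


def pick_build(vnf: str, platform: Dict) -> Optional[str]:
    """Return the build of a VNF that fits a given edge platform, if one exists."""
    for build in VNF_BUILDS:
        if (build["vnf"] == vnf
                and build["arch"] == platform["arch"]
                and build["hypervisor"] == platform["hypervisor"]
                and build["needs"] <= platform["middleware"]):
            return build["version"]
    return None  # yet another variant somebody would have to create and certify


print(pick_build("firewall", EDGE_PLATFORMS[0]))  # 1.2-x86-kvm
print(pick_build("firewall", EDGE_PLATFORMS[1]))  # 1.2-arm-ctr
```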

Challenge number two is the need to support migration to pooled resources as usage grows.  NFV has little value if all it does is create feature-agile edge devices.  Eventually operators will either develop vCPE opportunity to the point where central hosting is justified, or create resource pools for other missions (NFV or otherwise) that offer low unit costs.  In either case, you have to be able to move the functions when it makes economic sense.
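
As a rough sketch of that economic test (the figures are purely illustrative), the migration decision is just a break-even comparison between the ongoing saving a pool offers and the one-time cost of moving the function:

```python
def should_migrate(edge_monthly_cost: float, pool_monthly_unit_cost: float,
                   one_time_move_cost: float, months_remaining: int) -> bool:
    """Toy break-even test: move a function off the edge device only when the
    savings over the remaining service term outweigh the cost of the move."""
    monthly_saving = edge_monthly_cost - pool_monthly_unit_cost
    return monthly_saving * months_remaining > one_time_move_cost


# A pool that saves $15 a month justifies a $100 move with a year left to run...
print(should_migrate(40.0, 25.0, 100.0, 12))  # True
# ...but not with only three months remaining on the service.
print(should_migrate(40.0, 25.0, 100.0, 3))   # False
```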

The third challenge is administration of VNF licenses.  Operators will be acquiring virtual functions under a variety of commercial terms, ranging from per-use to flat fee.  They’ll likely have to be able to account for where they’ve deployed VNFs and quite possibly what they deployed them on, and this accounting will have to work both for edge-hosted vCPE and for central hosting.
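
A minimal sketch of what that accounting might involve, with invented license terms: the ledger records where each VNF instance runs so charges reconcile the same way whether the function is edge-hosted or centrally hosted.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical license terms per VNF: either a flat monthly fee or a per-use charge.
LICENSE_TERMS = {
    "firewall":  {"model": "flat",    "monthly_fee": 200.0},
    "wan-accel": {"model": "per_use", "per_instance_fee": 15.0},
}


class LicenseLedger:
    """Minimal sketch of license accounting: record where each VNF instance
    runs (edge CPE or cloud host) so charges can be reconciled either way."""

    def __init__(self) -> None:
        self.deployments: List[Dict[str, str]] = []

    def record(self, vnf: str, customer: str, host: str) -> None:
        self.deployments.append({"vnf": vnf, "customer": customer, "host": host})

    def monthly_charges(self) -> Dict[str, float]:
        charges: Dict[str, float] = defaultdict(float)
        seen_flat = set()
        for d in self.deployments:
            terms = LICENSE_TERMS[d["vnf"]]
            if terms["model"] == "per_use":
                charges[d["vnf"]] += terms["per_instance_fee"]
            elif d["vnf"] not in seen_flat:
                charges[d["vnf"]] += terms["monthly_fee"]
                seen_flat.add(d["vnf"])
        return dict(charges)


ledger = LicenseLedger()
ledger.record("firewall", "acme", "edge:cpe-0231")       # edge-hosted
ledger.record("wan-accel", "acme", "cloud:pool-east-2")  # centrally hosted
print(ledger.monthly_charges())  # {'firewall': 200.0, 'wan-accel': 15.0}
```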

On the platform-multiplicity side, we had a couple of interesting announcements recently.  Wind River has been working with virtual function partners (its early partners include Brocade, Check Point, InfoVista, and Riverbed) to develop a reference design for business-targeted edge-hosted vCPE platforms.  Called “virtual business customer premises equipment” or vBCPE, the reference design would standardize the software platform on which VNFs run.  This would almost guarantee that any device based on the reference design would run the same set of VNFs.  Since the reference design could be implemented as a small on-premises server or a cloud server, it would cover both models of vCPE deployment.

Overture Networks has a somewhat different approach to the platform problem in its Ensemble 2.0 release.  The Overture Connector is essentially a software-based business service demarcation device that includes data path, protocol, and control functionality to build a business overlay on top of any Ethernet or IP termination.  It’s based on CentOS and KVM, and includes not only VNF deployment features but also SDN control and VNF connection capabilities.  The Overture Connector can also be deployed on virtually any standard server, so it could be used at the edge or in the cloud as well.

I think that Wind River and Overture have gone about as far as you can go at this point to create a vCPE platform model.  I remain concerned about cloud-hosted VNF deployment policies and management and operations connections, but these would likely have to be addressed in central VNF management.  Overture’s Ensemble is one of the six products I believe could make the NFV business case, and Wind River is an infrastructure partner of HPE, another of those vendors.

HPE is also the source of the third announcement, a catalog-and-partner-centric vision for VNF deployment.  The jumping-off point here is the reality that vCPE (or any kind of NFV service) is likely to be based on a catalog of deployable elements, selected for use either by a customer service rep through an operations tool or by the customer through a web portal.

HPE’s service catalog is designed to let either customers or CSRs pick compatible VNFs from a list, and then launch those VNFs onward through deployment into use.  HPE plans a three-tier VNF partner program to represent various levels of certification, with basic onboarding stuff representing Tier One and advanced lifecycle management and OSS/BSS integration in Tier Three.
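
To make the catalog idea concrete, here’s a deliberately simplified sketch of compatibility filtering.  It is not HPE’s actual catalog schema or API, just an illustration of the kind of check a self-service portal or CSR tool would need to run before handing a selection off to deployment:

```python
from typing import List

# Purely illustrative catalog entries -- not any vendor's actual schema or API.
CATALOG = [
    {"vnf": "firewall",  "tier": 3, "requires": set(),        "conflicts": set()},
    {"vnf": "nat",       "tier": 1, "requires": set(),        "conflicts": {"cgnat"}},
    {"vnf": "wan-accel", "tier": 2, "requires": {"firewall"}, "conflicts": set()},
]


def compatible_additions(selected: List[str]) -> List[str]:
    """Offer only catalog entries whose prerequisites are already in the
    customer's selection and which don't conflict with anything chosen."""
    chosen = set(selected)
    options = []
    for entry in CATALOG:
        if entry["vnf"] in chosen:
            continue
        if entry["requires"] <= chosen and not (entry["conflicts"] & chosen):
            options.append(entry["vnf"])
    return options


print(compatible_additions([]))            # ['firewall', 'nat']
print(compatible_additions(["firewall"]))  # ['nat', 'wan-accel']
```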

This is interesting because it represents a kind of top-down vision of 2016-target vCPE deployment support, from a vendor whose products can make the business case for NFV overall.  Overture, I’ve noted, has similar business-case-making breadth.  If both these vendors are going to promote a vision of vCPE that could be expanded into full operations integration and agile service creation, then it is possible that the OSS/BSS vendors aren’t the only players who could unify NFV trial-and-PoC silos down the line.

The Wind River approach, and particularly the Overture one, raises another question, which is whether purpose-built edge-hosting platforms for vCPE are a better bet than a server platform.  Overture seems to have evolved its own view from purpose-built to server by offering agile software to run on pretty much anything.  Can we build specialized CPE that would offer better features or value without doing stuff that would then tend to lock VNFs into the edge-hosted model and foreclose future cloud migration?  That point needs to be tested, and I think the vCPE space is evolving to make that testing possible.

I’d sure like to identify as many pathways to a complete NFV business case as possible.  I have to admit that I’m still uncomfortable about the notion of deploying “NFV” in vCPE form without any specific notion of how you’re going to tie the result into a broad NFV commitment.  To be sure, there will be operators who don’t expect to make such a commitment (managed service providers are the best example, but even a full-service common carrier might see NFV as useful only to augment carrier Ethernet services, for example).  Most don’t feel that way, though.  Of the 57 operators I’ve talked with in some detail, only three indicated that limited NFV deployment was “acceptable”.  These three developments don’t ensure that vCPE can evolve into “real NFV” but they do facilitate that evolution, and that’s a good thing.

Optimizing the Cloud Opportunity in 2016

The cloud is surely a popular topic with the media and also with enterprises, but despite the agreement of these two groups that the cloud is interesting, they don’t agree much on what exactly is happening with the cloud.  Perhaps even more significant is the fact that they don’t agree about the future of the cloud either.

The number of “enterprises”, meaning multi-site businesses with at least ten thousand employees, who are not using the cloud is below the level of statistical significance in surveys.  However, cloud spending is still hovering at a couple percent of IT spending, and enterprises are still predicting that cloud spending won’t exceed 25% of total IT spending.  Based on this, it would be tempting to say that the cloud is radically over-hyped, but that’s a little too simple a conclusion.

The big problem we have with “the cloud” is the same as the one we have with other revolutionary technologies, like SDN or NFV or IoT.  It’s sometimes jokingly called the “groping the elephant” problem, after the old story about trying to identify an elephant hidden behind a screen by feeling around.  Get a leg and it’s a tree; get a trunk and it’s a snake; it’s hard to grasp the totality.  So it is with the cloud, but the screen in the cloud case is the filter implicit in our presumptive model of IT.  We think in legacy terms, and legacy limitations are inherited.  The cloud won’t reach optimum penetration unless we think about it in cloud terms.

The current cloud successes have come largely by transforming web-related application processing.  Many companies have gone to cloud hosting of their web presence, and for enterprises this has reduced server spending by shifting that mission to a third party.  Remember, though, that for most enterprises web processes are front-ends to the core systems.  These systems remain almost entirely in-house, with the exception of some backup and capacity augmentation planning.  For the present at least very few enterprises see that changing much, if at all.

The interesting thing is that enterprises think their cloud spending will rise more than the potential transformation of web applications and cloudbursting could explain.  They can’t really rationalize this expectation, but at its core is the notion that there’s something the cloud could do that they are not doing now.

There is such a thing: mobility-based worker productivity enhancement.  If you survey the topic outside the context of the cloud, enterprises think that a spending bump of as much as 30% of current IT spending could be generated.  That’s a lot of money (roughly ten times current cloud spending), but it’s not easy to harness.

Any cloud or IT product salesperson would tell you that an opportunity target like “mobility-based worker productivity enhancement” is an invitation to a sales cycle so long and with so small a chance of eventual success that they might as well just apply for unemployment.  In order to create realizable opportunity, that hazy business goal would have to be converted into something that could drive a project, and there we have the challenge not only for the cloud but for IT in general.

The cloud does not make workers more productive, but applications that run in it could do that.  Given that, what would have to be sold is first the applications that can (through mobility) further empower workers, and then the vision of hosting these apps in the cloud.  Whenever two things need to be done, each tends to wait for the other.

Logically, the solution to cloud-based productivity enhancement should be presented as SaaS, which would make it consumable in a single step.  That sounds easy, but sit down yourself and try to diagram a cloud-based mobile-productivity application and define how it links with core business applications and repositories.  It’s hard to generalize something like that, and if there is no general solution then the project is a systems integration or software development project first, and a cloud project second.  We’re back to two sides watching each other.
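
To see why that’s hard to generalize, here’s a deliberately simplified sketch with invented names: the mobile-facing service itself is thin and cloud-hostable, but everything it depends on below the dashed line is per-enterprise integration with core systems and policies.

```python
import json
from typing import Dict


def order_status_for_worker(worker_id: str, order_id: str) -> Dict[str, str]:
    """What the mobile app actually calls: a thin, cloud-hostable service."""
    record = fetch_from_core_erp(order_id)     # integration work, not generic SaaS
    if not worker_may_see(worker_id, record):  # entitlement policy lives in-house
        raise PermissionError("not authorized for this order")
    return {"order": order_id, "status": record["status"], "eta": record["eta"]}


# ---- everything below is per-enterprise integration, not reusable product ----

def fetch_from_core_erp(order_id: str) -> Dict[str, str]:
    # Stand-in for a call into the in-house ERP or order repository.
    return {"status": "in-fulfillment", "eta": "2016-02-12"}


def worker_may_see(worker_id: str, record: Dict[str, str]) -> bool:
    # Stand-in for the company's own entitlement rules.
    return True


print(json.dumps(order_status_for_worker("w-1007", "ord-5521")))
```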

That face-off is important because in a market where there’s a high barrier to early success created by implementation needs or buyer education needs (or both), every possible market-mover has to look at risk and reward.  Does a cloud provider want to step up and propose a total productivity solution?  Most would admit that’s beyond their capabilities.  But does an application provider want to propose a general strategy that would be suitable for cloud-hosting?  Could they hope to get all the money associated with their effort, or would they simply open a market that others would inevitably share and possibly even dominate?

Some market players might have enough pieces of the cloud/productivity pie to be able to move on it.  Microsoft and IBM are examples; they both have substantial application productivity experience and also cloud mindshare.  HPE, Oracle, and SAP are less obviously tied to all the pieces of the puzzle but could still create a formidable strategy from what they have.  Amazon and Google, of course, could always just buy some application smarts to drive things.  A lot of possibilities…

…all of which have been on the table all along.  The fast track option to what the cloud aficionados would call “hypergrowth” could have been taken this year, or last.  Since it wasn’t, you have to assume that there are barriers that are not clear or are considered too high for the moment at least.  That leaves us with a “slow-track” option.

Web front-ends to current applications are aimed either at those outside the company or at company workers.  In the latter case, the front-ends are obviously productivity tools, and the transformation of these tools to better suit mobile devices is a recognized requirement.  Could that be a starting-point for a more comprehensive vision of mobile productivity, suitable for cloud-hosting?  It sure could, and many companies have plans or even products in this area.

The timing is the question, because even this evolutionary approach apparently looks pretty revolutionary and scary to all the possible players.  You’d think it shouldn’t, because transformation to the cloud based on building new applications with new benefits would not erode current IT spending as moving current stuff to the cloud would.  But vendors tell me that their current initiatives toward mobile productivity are not designed for the cloud, though they hasten to say they could be hosted there.  That puts most vendors in the position of picking between an approach that sells traditional hardware and software or one that sells cloud services.  Historically we know how that goes.

What is probably going to happen is something in between our evolution and revolution choices.  Some vendor or operator is going to find a way to do enough to demonstrate compelling value, but not so much that cautious executives are driven to cover, and we might just see the signs of what and who in 2016.  It’s hard to handicap who’d do this, but my bet would be a player with both cloud and application skills.  They have the best chance of seeing that elephant behind the screen.