Exploring the Operations Implications of the Verizon Model

The issue of operationalizing next-gen networks and services is critical for operators, and it’s thus fitting to close this week’s review of the Verizon architecture with comments on OSS/BSS integration.  There are two questions to be answered: can the approach deal with the efficiency/agility goals that will have to be met to justify SDN/NFV, and can it accommodate the political divisions over the future of OSS/BSS?

Every Tier One I know is conflicted on the issue of OSS/BSS.  It’s not that the functions themselves are not needed (you have to sell services to make money) but that the functions are wrapped in an architecture that seems in every respect to be a dinosaur.  OTTs who have adopted modern cloud and web principles seem to be a lot more efficient and agile, and thus there’s a camp within each operator who wants to toss the OSS/BSS and remake the functions along web/cloud lines.  On the other hand, every OSS/BSS expert is going to resist this sort of thing (just like router experts resist transformation to SDN) and in any event, changing out your core business systems is always going to present risks.

The Verizon paper introduces the notion of layered service abstractions, starting at the top with an end-to-end retail-driven vision and ending at the bottom with virtual features/functions/devices.  These layers can be used to assemble a spectrum of network-service-operations relationships, and if these relationships are broad enough they could cover all the bases needed for benefit generation and political consensus-building.  Do they?  Let’s use the extreme cases to see, and let’s assume that we could graph the resulting structural relationship between OSS/BSS, independent EEO, and resources as a small-letter “y” whose left, shorter, branch could connect either up top or down lower.

One extreme case is to consider the OSS/BSS system to end with the high-level retail model.  A service is a set of commercially defined functions represented by SLAs.  Those functions, from an OSS/BSS perspective, are atomic—think of them as “virtual devices” if you like.  OSS/BSS systems deploy them, and customer portals and service lifecycle management processes treat the SLA parameters as representing service and resource behavior.

If you charted this in our “y” model, the left bar would join the main bar very close to the top.  EEO, then, would generate the vision of the service that operations systems saw, reducing the role of OSS/BSS to the commercial management of the service and leaving the issues of deployment and lifecycle management to EEO.

The other extreme case is to consider the OSS/BSS system to be responsible for the lowest-level virtual function/device models.  This would open two options for our conceptual structure.  The first would drive the left bar of the “y” downward to touch the resources, and the second would make the “y” into an “l” with a single branch.

The first approach would say that while EEO should be responsible for the deployment and lifecycle management, the OSS/BSS would see the lowest-level virtual devices.  This approach is friendly to current OSS/BSS models and perhaps to the legacy TMF approach, because it retains contact between “resources” and “operations” in its current form.  We have management systems today that operate, somewhat at least, in parallel with operations, and this would perpetuate that.

The second says that it’s fine to have EEOs but they need to be inside the OSS/BSS.  That component then has a linear relationship with all the model layers of the Verizon architecture.  This, I think, is the essential model for the TMF ZOOM project, though the details of that architecture aren’t fully open to the public at this point.  It would magnify the role of OSS/BSS, obviously, and favor the OSS/BSS vendors.  It also perpetuates the OSS/BSS and its role.

If you look at these extremes, particularly in terms of my “y” topology, you see that the Verizon approach of layers of models opens the opportunity to connect the OSS/BSS in at any of the modeling layers, meaning that you can slide that left bar up and down.  It also lets operators elect to integrate the functions “above the junction” of the left bar with the OSS/BSS, turning the model into an “l”.

All this capability isn’t automatic, though.  To understand the issues of implementation, let’s move from our “y” model to another one, resembling an “H”.

The left vertical bar of our “H” represents the OSS/BSS flow and visibility, and the right the EEO or incremental orchestration view.  Both these can coexist, but to make sure they don’t end up as ships-in-the-night competitors to management, there has to be a bridge between the domains—the crossbar.  The purpose of this is to establish a kind of visibility bridge—at the model layer where this crossbar is provided, there is a set of processes that convert between the two sets of abstractions that drive the OSS/BSS and EEO flows.  What is above the crossbar is invisible to the other side, and what is below it is harmonized—reflecting the likelihood that the two verticals represent different but necessarily correlated views.

Wherever this crossbar exists, the model for the associated layer has to provide both operations-friendly and orchestration-friendly parameters to represent the status, and that has to include lifecycle state if that state has to be coordinated between EEO and OSS/BSS.  Where the bar is set higher, meaning where the model layers represent more functional and less structural abstractions, the same parameters would likely serve both sides and little work is needed.  If you drive the bar lower, then you encounter a point where it’s desirable to have the operations view composed from the EEO view.

In an operator-defined architecture like Verizon’s, there’s no need to support a range of options for positioning the bar because the operator can make a single choice.  For a general architecture, vendors and operators would have to expect that the modeling at every layer provide for the coequal viewing of OSS/BSS and EEO elements, and support the necessary parametric derivations—both in read and write terms—to provide that capability.
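To make the crossbar idea a bit more concrete, here is a minimal Python sketch of the "read and write" parametric derivations at a single model layer.  Everything in it is an assumption made for illustration: the `Crossbar` class, the parameter names, and the state values are hypothetical and are not taken from Verizon's paper or any TMF or ETSI specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Crossbar:
    """Hypothetical bridge between the EEO and OSS/BSS views at one model layer."""
    # Read direction: derive each operations-side parameter from EEO-side state.
    read_map: Dict[str, Callable[[dict], object]] = field(default_factory=dict)
    # Write direction: translate an operations-side write into EEO-side updates.
    write_map: Dict[str, Callable[[object], dict]] = field(default_factory=dict)

    def operations_view(self, eeo_state: dict) -> dict:
        """Compose the operations view from the orchestration view."""
        return {name: derive(eeo_state) for name, derive in self.read_map.items()}

    def apply_operations_write(self, eeo_state: dict, name: str, value) -> dict:
        """Return a copy of the EEO state with an operations-side change applied."""
        updated = dict(eeo_state)
        updated.update(self.write_map[name](value))
        return updated


def service_state(state: dict) -> str:
    # Illustrative lifecycle rule: every VNF must be up for the service to be ACTIVE.
    return "ACTIVE" if all(v == "up" for v in state["vnf_status"].values()) else "DEGRADED"


def availability_pct(state: dict) -> float:
    return round(100.0 * state["uptime_s"] / state["window_s"], 2)


def admin_state_to_eeo(value: str) -> dict:
    return {"desired_state": "running" if value == "UNLOCKED" else "stopped"}


bar = Crossbar(
    read_map={"service_state": service_state, "availability_pct": availability_pct},
    write_map={"admin_state": admin_state_to_eeo},
)

eeo_state = {
    "vnf_status": {"fw": "up", "nat": "up"},
    "uptime_s": 85536,
    "window_s": 86400,
    "desired_state": "running",
}
ops_view = bar.operations_view(eeo_state)
locked = bar.apply_operations_write(eeo_state, "admin_state", "LOCKED")
```

The point of the sketch is only that the bridge is a pair of mappings attached to a layer.  Where the layer is functional rather than structural, both maps would be close to identity, and the lower you set the bar the more real derivation logic ends up in them.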

The bidirectionality of the bar is important because it illustrates the fact that there are two parallel paths—operations and orchestration—and that these can be harmonized in part by “exporting” functionality from one to the other.  This is the answer to the political dilemma that OSS/BSS modernization seems to pose, because if you use the bar to shunt orchestration into OSS/BSS you “modernize” it, and if you use the bar to move OSS/BSS functions (by making them orchestrable) into the orchestration side, you essentially replace the current OSS/BSS concept.

I don’t see specific evidence of this “bidirectional bar” in the Verizon approach, but I think that all of the six vendors with current full-spectrum NFV capability could provide it.  It will be interesting to see if the emergence of carrier-developed models (like those of AT&T and Verizon) will raise recognition of the multiplicity of possible OSS/BSS-to-orchestration relationships, and create some momentum for a solution that can accommodate more, or even all, the options.

The Implications and Impacts of Verizon’s End-to-End Hierarchical Modeling

It has always been my view that NFV would be better and more efficient if there were a common modeling approach from the top layer of services to the bottom layer of infrastructure.  I still feel that way, but I have serious doubts on whether such a happy situation can now arise.  The service-centric advance to NFV now seems the only path, and that advance almost guarantees a multiplicity of modeling approaches.  They might be harmonized, though, by adopting some of the principles outlined in the Verizon paper I’ve blogged about, and that’s the topic of the day.

A model is an abstraction that represents a complex interior configuration.  In NFV, the decomposition of a model into that internal complexity is the responsibility of the orchestration process.  Everywhere you have a model, you have an orchestrator, and of course if you have different modeling approaches then you have multiple orchestrators.  I’ve always felt that introduced complexity and inefficiency, which is why I favored a single one, but we’ve already noted that’s probably no longer practical.

A generalized NFV architecture should have, and likely would have, contained a single modeling/orchestration implementation, but the ETSI work hasn’t defined all the layers and only a few (six) vendors have a unified architecture to date.  Further, there’s been little (well, let’s be honest, no) progress toward a full-scope NFV business case.  That’s what got us on a service-driven path, and service-driven approaches rarely develop holistic modeling/orchestration visions.  That’s because individual services don’t expose the full set of issues.

There are two pieces of good news in this.  First, most service-driven NFV starts at the bottom and goes no higher than necessary, which isn’t very far.  Second, a proper modeling and orchestration approach at a higher level can envelop and harmonize the layers below even if they’re not based on the same approach.  This is one of the features of Verizon’s End-to-End Orchestration or EEO approach, but it also applies down deeper, and it’s the basis for coercing order from service-specific NFV chaos.  But it’s not without effort and issues.

Let’s suppose we have a “classic ISG” implementation of NFV, which means that we have some sort of model that represents the VNF deployment requirements of a service, which means we have individual VNFs and the “forwarding graph” that somehow connects them.  This combination, represented using something like YANG, represents a specific model at a low level.  It’s the sort of thing you might find in a vCPE business service deployment.

Now let’s suppose we want to incorporate this in a broader service vision, one that for example includes some legacy service elements that the first model/orchestration didn’t support.  We could add a new layer of model/orchestration above the first.  If our first model is M0 we could call this second one M1 to show its relationship.  The M1 model would have to be able to properly decompose requests for legacy provisioning, and it would have to be able to recognize a model element that represented an M0 structure and decompose that structure into a request to pass a low-level model to the appropriate low-level orchestrator.  This is a hierarchical decomposition—one model can reference another as an interior element.

In my example, I assumed that the M1 model had orchestration that would directly process legacy deployments, but you could just as easily have had a second M0 level, this one for legacy, and had the M1 level reference the models of either of the two options below.  Thus, even if you assumed that you had two different ways of implementing NFV (deploying VNFs) you could still envelop them both in a higher-level model, as long as each of the two options below could be identified.  You could either give each a distinct model element or have the decomposition logic determine which of the two was needed.
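The M1-over-M0 arrangement described above can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: the class names, the string "actions," and the example service are hypothetical, and a real M0 element would hand a YANG (or similar) model to its own orchestrator rather than return text.

```python
from dataclasses import dataclass, field
from typing import List


class ModelElement:
    """Anything that can be decomposed into lower-level actions."""

    def decompose(self) -> List[str]:
        raise NotImplementedError


@dataclass
class M0VnfGraph(ModelElement):
    """Low-level NFV model: individual VNFs plus the forwarding graph linking them."""
    vnfs: List[str]

    def decompose(self) -> List[str]:
        steps = [f"deploy {v}" for v in self.vnfs]
        # Connect each VNF to the next one in the chain.
        steps += [f"link {a}->{b}" for a, b in zip(self.vnfs, self.vnfs[1:])]
        return steps


@dataclass
class M0Legacy(ModelElement):
    """Low-level model for a legacy element the NFV model does not cover."""
    service: str
    device: str

    def decompose(self) -> List[str]:
        return [f"provision {self.service} on {self.device}"]


@dataclass
class M1Service(ModelElement):
    """Higher-level model that references lower-level models as interior elements."""
    name: str
    children: List[ModelElement] = field(default_factory=list)

    def decompose(self) -> List[str]:
        steps: List[str] = []
        for child in self.children:
            # M1 does not know how a child is realized; it only asks it to decompose.
            steps.extend(child.decompose())
        return steps


service = M1Service(
    "business-vpn",
    [M0Legacy("ethernet-access", "edge-router-7"), M0VnfGraph(["firewall", "nat"])],
)
plan = service.decompose()
```

Because `M1Service` is itself a `ModelElement`, it can be nested inside another `M1Service`, which is the "any number of layers" property in miniature.  Here the two lower-level options are distinguished by giving each its own element type, the first of the two identification approaches mentioned above.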

What this shows is that layered modeling and orchestration can accomplish all sorts of useful stuff, including harmonizing different implementations or addressing the deployment of things that a given model doesn’t include/support.  And it can be carried on to any number of layers, meaning that you could orchestrate a dozen different model layers.  This (sorry to beat a dead horse here!) is another reason I liked the idea of a single model/orchestration approach.  It would have let us decompose a model using recursive processing by a single piece of software.  But, onward!

Verizon’s paper calls for two layers of service modeling, one representing the retail view of the service as it might be seen by a customer or the OSS/BSS, and the second representing the input to a connection controller (SDN controller with legacy capability).  I think it would be helpful to generalize this to allow any number of layers, and to recognize that each “leaf/branch” on the tree of a service hierarchy would pass from service-abstract to resource-abstract in its own way at its own time, subject to the higher-layer service/process synchronization needed to ensure pieces get set up in the right order (which Verizon’s paper includes as a requirement).

How about Verizon’s EEO?  The Verizon paper has an interesting point in its section on E2E service descriptors:

Apart from some work in ETSI/NFV, which will be discussed below, there has not been much progress in the industry on standardizing EENSDs. That is not considered an impediment, for the following reasons:

  1. EEO functionality and sophistication will improve over time
  2. Operators can start using EEO solutions in the absence of standard EENSDs

Since an EENSD is essentially a service model, what Verizon is saying is that you could hope that the industry would converge on a standard approach there, but if it didn’t operators could still use proprietary or service-specific EEO strategies.  True, but they could also simply overlay their “EEO” models with a “super-EEO” model and harmonize that way.

I said earlier that this wasn’t necessarily a slam-dunk approach, and the reason should be obvious.  If a given Mx is to superset a lower-level model then the decomposition to that lower-level model and the invocation of its orchestration process has to be incorporated into the modeling/orchestration at the Mx level.  Somebody would have to write the code to do this, and even if we assume that the orchestrator at our Mx level is open-source, there’s still work to be done.  If it isn’t, then only the owner of the software could do the modification unless there was a kind of plug-in hook mechanism provided.

To make this kind of model-accommodation easier, the first requirement is that all modeling approaches provide the documentation (and if needed, licensing) to allow their model to be enveloped in one at a higher layer.  The second requirement is that any modeling layer have that plug-in hook or open-source structure such that it can be expanded to include the decomposition of new lower-level models.
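A plug-in hook of the kind described here might look something like the following Python sketch.  It is a guess at the general shape, not a description of any vendor's or open-source project's actual interface; the `Orchestrator` class, the `register` method, and the model type names are all hypothetical.

```python
from typing import Callable, Dict, List

# A decomposer takes a model element and returns the actions or hand-offs it implies.
Decomposer = Callable[[dict], List[str]]


class Orchestrator:
    """Hypothetical Mx-level orchestrator with a registration hook for lower-level models."""

    def __init__(self) -> None:
        self._decomposers: Dict[str, Decomposer] = {}

    def register(self, model_type: str, decomposer: Decomposer) -> None:
        """The plug-in hook: teach the orchestrator a new lower-level model type."""
        self._decomposers[model_type] = decomposer

    def decompose(self, element: dict) -> List[str]:
        model_type = element["type"]
        if model_type not in self._decomposers:
            # Without a plug-in, a foreign model simply cannot be enveloped.
            raise LookupError(f"no decomposer registered for model type {model_type!r}")
        return self._decomposers[model_type](element)


def hand_off_to_vcpe_orchestrator(element: dict) -> List[str]:
    # A real plug-in would call the lower-level orchestrator's documented API;
    # here the hand-off is only recorded as text.
    return [f"pass model {element['model_ref']} to the vCPE orchestrator"]


mx = Orchestrator()
mx.register("vcpe-yang", hand_off_to_vcpe_orchestrator)
actions = mx.decompose({"type": "vcpe-yang", "model_ref": "vcpe-site-42"})
```

The two requirements map directly onto the sketch.  The documentation requirement is what lets somebody write `hand_off_to_vcpe_orchestrator` at all, and the hook requirement is what lets them add it without touching the `Orchestrator` code.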

All of this could be accomplished in two broad ways.  First, any of the six vendors with a comprehensive implementation could focus on “de-siloing” and service harmonization in their development and positioning.  Second, some standards group or open-source activity could address it as an explicit goal.  I think AT&T and Verizon have both made the goal implicit in their announced approaches, but real progress here is going to depend on somebody picking up the standard of harmonization and making a commitment.

A final interesting point is that this approach appears to offer an opportunity to offer “modeling-and-orchestration-as-a-service”.  Higher level models could be linked to cloud service portals, passing off lower-level provisioning and lifecycle management to operators’ own implementations.  This could create a whole new set of NFV opportunities and competition among model providers could move the whole service-first approach ahead, to the benefit of all.

Lessons from Taking a Service-Inward View of NFV

Getting closer to the buyer and to the dollars is always good advice in positioning a product or service.  For network operators, that means looking at what services they sell, and for network operators reviewing the potential of SDN/NFV, it means looking at how these new technologies can improve their services.  But “services” doesn’t necessarily mean “all services.”  In my last two blogs, I used a combination of operator comments to me on their view of NFV’s value and future, and Verizon’s SDN/NFV architecture paper, to suggest that operators were looking at NFV now mostly in a service-specific sense.  Holistic NFV, then, could arise as an almost-accidental byproduct of the sum of the service-specific deployments.

One question this raises is just how far “holistic NFV” can go, given that early projects might tend to wipe low-hanging benefits off the table.  Another is whether silos of NFV solutions, per service, might dilute the whole holistic notion.  I don’t propose to address either of these at this point, largely because I’ve talked about these problems in prior blogs.  What I want to do instead is look at what “service-driven NFV” might look like.

Service-driven NFV has to start with service objectives, first and foremost.  There is relatively little credibility for capex reduction as an NFV driver overall, and I think less in the case of service-driven NFV.  Few “services” offer broad opportunities to reduce capex, broad enough to impact the bottom line and justify taking some risks.  There are some credible opex benefits that might be attained on a per-service basis, but again the issue of breadth comes in.  In addition, there’s a risk that specialized NFV operations within a single service could create islands of opex practices that would end up being confusing and inefficient.  That means revenue-side, or “service agility” benefits would have to be the key.

That’s a conclusion consistent with both my operator survey and the content of the Verizon paper, I think.  Operators told me they liked mobile services and business services as NFV targets, and their specific comments focused on portal provisioning and customer care, agile deployment of incremental managed service features, etc.  The big focus in Verizon seems to be the same, with the specific adjustment that “mobile” probably means “5G”.  In fact, about half of the over-200 page Verizon paper is devoted to mobile issues and applications.

Everyone these days thinks “services” mean “portals”, and that’s true at one level.  You need to have self-service user interfaces to improve agility or the operator’s customer service processes are just delays and overhead.  However, a portal is a means of activation and presentation, and that’s all it is.  You still need to have something to process the activations and to generate what you propose to present.

If customers are going to have a portal that provides them both service lifecycle information and the ability to make changes to services or add new ones, then the critical requirement is to have a retail representation of a service that can be decomposed into infrastructure management, including the deployment of virtual functions (NFV) and the creation of ad hoc network connectivity (SDN).  In the Verizon paper this is accomplished through a series of hierarchical models.  There is a model that represents the service as a portal or operations system would offer it (the retail vision).  Another model represents the connectivity options available from the underlying infrastructure, and yet another the “abstract devices” that create the connectivity.  The models build up to or decompose from (depending on your perspective) their neighbors.

Implicitly, the service-driven vision of NFV would start with the creation of this model hierarchy.  The retail presentation (analogous to the TMF’s “Product”) would decompose into functional elements (TMF Customer-Facing Services?) and then into the infrastructure connectivity elements (TMF Resource-Facing Services?) that would be built from the abstract devices.  To make this process amenable to portal-driven ordering, you’d need the model hierarchy to define the service based on all of the possible infrastructure options that might be associated with a given order, meaning that the model would have to support selective decomposition based on a combination of the order parameters and the service-versus-network topology-to-infrastructure relationships.  A portal order could then initiate a selective decomposition of the model, ending in a deployment.
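Selective decomposition is easier to picture with an example.  The Python sketch below assumes a retail model in which each functional element lists its infrastructure options in priority order, each guarded by a test on the order parameters.  The feature names, option names, and order fields are invented for illustration and do not come from the Verizon paper or the TMF models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Option:
    """One candidate decomposition of a retail element (hypothetical structure)."""
    name: str
    applies: Callable[[Dict], bool]  # test against order and site/topology facts
    elements: List[str]              # lower-level model elements to instantiate


# Each functional element lists its options in priority order; the last one
# in each list is an unconditional fallback.
RETAIL_MODEL: Dict[str, List[Option]] = {
    "managed-firewall": [
        Option("on-cpe", lambda o: o["site_has_agile_cpe"], ["cpe-hosted-fw-vnf"]),
        Option("cloud-hosted", lambda o: True, ["dc-hosted-fw-vnf", "site-to-dc-tunnel"]),
    ],
    "vpn-access": [
        Option("sdn", lambda o: o["region"] == "east", ["sdn-vpn-connection"]),
        Option("legacy", lambda o: True, ["mpls-vpn-provisioning"]),
    ],
}


def decompose_order(order: Dict) -> List[str]:
    """Pick the first applicable option for each ordered feature."""
    plan: List[str] = []
    for feature in order["features"]:
        option = next(opt for opt in RETAIL_MODEL[feature] if opt.applies(order))
        plan.extend(option.elements)
    return plan


order_a = {"features": ["vpn-access", "managed-firewall"],
           "site_has_agile_cpe": True, "region": "west"}
order_b = {"features": ["managed-firewall"],
           "site_has_agile_cpe": False, "region": "east"}
plan_a = decompose_order(order_a)
plan_b = decompose_order(order_b)
```

The same retail order for a managed firewall ends in a CPE-hosted deployment at one site and a cloud-hosted one at another, which is the behavior a portal would need if customers are not supposed to know or care which infrastructure serves them.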

Automated deployment doesn’t address all the issues, of course.  A user who depends on a portal for service orders and changes is likely to depend on it for service status as well.  Thus, it’s reasonable to assume that the retail model of the service defines parameters on which the service is judged by the buyer—the SLA.  The model would then have to create a connection between these parameters and the actual state of the service, either by providing a specific derivation of SLA statistics from lower-level model statistics, or by doing some sort of status query on an analytics database.
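The first of those two options, deriving SLA statistics from lower-level model statistics, can be sketched as follows.  The sketch rests on assumptions I'm adding for illustration: that the service is a simple series chain of elements, that element failures are independent (so availabilities multiply), and that latencies add.  The class and field names are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List


@dataclass
class ElementStats:
    """Statistics reported by one lower-level model element."""
    name: str
    availability: float  # fraction of time in service, 0.0 to 1.0
    latency_ms: float


@dataclass
class Sla:
    """Retail-level parameters on which the buyer judges the service."""
    min_availability: float
    max_latency_ms: float


def sla_status(sla: Sla, elements: List[ElementStats]) -> Dict[str, object]:
    # Assumption: elements are in series and fail independently.
    availability = prod(e.availability for e in elements)
    latency_ms = sum(e.latency_ms for e in elements)
    return {
        "availability": availability,
        "latency_ms": latency_ms,
        "compliant": availability >= sla.min_availability
        and latency_ms <= sla.max_latency_ms,
    }


chain = [
    ElementStats("access", availability=0.999, latency_ms=4.0),
    ElementStats("firewall-vnf", availability=0.9995, latency_ms=7.0),
]
relaxed = sla_status(Sla(min_availability=0.998, max_latency_ms=15.0), chain)
strict = sla_status(Sla(min_availability=0.999, max_latency_ms=15.0), chain)
```

Two elements that each look fine on their own can still miss a strict end-to-end target, which is why the derivation has to live in the retail model rather than being left to each lower layer.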

Both the deployment and lifecycle management activities associated with service-driven NFV pose a risk because a “service” may not expose the full set of requirements for either step, and thus new services might not fit into current models of NFV as operators seek to expand their NFV story.  Put another way, each service could end up being a one-off, sharing little in the way of software with others.  It’s even possible that early services would not develop generally reusable resource pools.

vCPE is a good example of this.  There is no question that the best model for managed service deployment would be an agile CPE device on which feature elements could be loaded.  Every model I’ve run on this suggests that it would always beat a cloud-hosted model of deployment for business service targets, which is where most operators want to use it.  Obviously, though, CPE resources to host VNFs wouldn’t be generally helpful in building a resource pool.  Less obvious is the fact that software to deploy VNFs on CPE wouldn’t have to consider all the issues of general pooled-resource infrastructure.  There’s no selection optimization, no connection of features through the network, and the derivation of management data is much easier (the CPE device knows the status of everything).

A general solution to service-specific deployments via evolving SDN/NFV technology could be created by expanding the OSS/BSS role, but Verizon’s architecture seems to focus on the opposite—containing that role.  Operators seem to think that pushing more details of SDN/NFV into the OSS/BSS is a bad move overall, and the TMF has yet to publish an open model that supports the necessary absorption.  At this point, I think that’s a dead issue, which is why I appended the question-marks on TMF references earlier.

A final consideration in a service-driven model of NFV is whether new services might be a major contributor.  Even in mobile NFV, Verizon and other operators seem to think that it would be difficult to drive NFV without a broader mobile change, like 5G, to help bear the cost and justify the disruption.  That suggests that a green-field service would be even more helpful.  IoT is the obvious one, and Nokia has recently suggested it thinks that some medical applications (which could be considered a subset of IoT) could also drive network change.  However, focusing on NFV as a platform for new services would be facilitated if there were a generic NFV model to build on, and getting that model in place may be more than new services can justify.

What We Can Learn from Verizon’s SDN/NFV Paper

Verizon has just released a white paper on its SDN/NFV strategy, developed with the help of a number of major vendors, and the paper exposes a number of interesting insights into Tier One next-gen network planning.  Some are more detailed discussions of things Verizon has revealed in the past, and some are new and interesting.  This document is over 200 pages long and far too complicated to analyze here, so I’m going to focus on the high-level stuff, and along the way make some comments on the approach.  Obviously Verizon can build their network the way they want; I’m only suggesting places where I think they might change their minds later on.

The key point of the paper, I think, is that Verizon is targeting improvements in operations efficiency and service agility, which means that they’ve moved decisively away from the view that either SDN or NFV is primarily a way of saving money on infrastructure.  This is completely consistent with what operators globally have told me for the last several years; capex simply won’t deliver the benefits needed to transform the business.  And, may I add, business transformation is an explicit Verizon goal, and they target it with both SDN and NFV.

On the SDN side, Verizon is particularly focused on the independence of the control and data planes (“media planes”, as Verizon puts it, reflecting the increased focus on video delivery).  This is interesting because it validates the purist SDN-controller-and-OpenFlow model of SDN over models that leverage software control of current switches and routers.  They also say that they are expecting to use white-box products for the switches/routers in their network, but note here that “white box” means low-feature commodity products and not necessarily products from startups or from non-incumbent network vendors.  It would be up to those vendors to decide if they wanted to get into the switch game with Verizon at the expense of putting their legacy product revenue streams at risk.

On the NFV side, things are a bit less explicit at the high level.  Verizon recognizes the basic mission of NFV as that of decoupling functional software from specific appliances to allow its hosting on server pools.

One of the reasons why the NFV mission is lightweight at the high level, I think, is that Verizon includes an End to End Orchestration layer in its architecture, sitting above both NFV MANO and the SDN controller(s).  This layer is also responsible for coordinating the behavior of legacy network elements that make up parts of the service, and it demonstrates how critical it is to support current technology and even new deployments of legacy technology in SDN/NFV evolution.

Verizon also makes an interesting point regarding SDN and NFV, in the orchestration context.  NFV, they point out, is responsible for deploying virtual functions without knowing what they do; the VNFs appear as equivalent to physical devices in their vision.  Their WAN SDN Controller, in contrast, knows what a function does but doesn’t know whether it’s virtualized or not.  SDN controllers then control both virtual and physical forms of “white box” switches.

One reason for this approach is that Verizon wants an architecture that evolves to SDN/NFV based on benefits, often service-specific missions.  That relates well with the point I made in yesterday’s blog about operators looking more at SDN or NFV as a service solution than as an abstract architecture goal.  All of this magnifies the role of the service model, which in Verizon’s architecture is explicitly introduced in three places.  First, as how an OSS/BSS sees the service, which presumably is a retail-level view.  Second, as the way of describing resource-behavior-driven cooperative service elements, and third (in the form of what Verizon calls “device models”) as an abstraction for the functional units that can be mapped either to VNFs or physical network functions (PNFs).  End-to-End Orchestration (EEO) then manages the connection of models and doesn’t have to worry about the details of how each model is realized.  This is a firm vote in favor of multi-layer, divided, orchestration.

Management in the VNF sense is accommodated in the Service Assurance function, which Verizon says “collects alarm and monitoring data. Applications within SA or interfacing with SA can then use this data for fault correlation, root cause analysis, service impact analysis, SLA management, security monitoring and analytics, etc.”  This appears to be a vote for a repository model for management.  However, they don’t seem to include the EMS data for legacy elements in the repository, which I think reflects their view that how a function is realized (and thus how it is managed) is abstracted in their model.

I do have concerns about these last two points.  I think that a unified modeling approach to services is both possible and advantageous, and I think that all management information should be collected in the same way to facilitate a unified model and consistent service automation.  It may be that Verizon is recognizing that no such unified model has emerged in the space, and thus is simply accommodating the inevitability of multiple implementations.

An interesting feature of the architecture on the SDN side is the fact that Verizon has three separate SDN controller domains—access, WAN, and data center.  This, I think, is also an accommodation to the state of SDN (and NFV) progress, because a truly powerful SDN domain concept (and a related one for NFV) would support any arbitrary hierarchy of control and orchestration.  Verizon seems to be laying out its basic needs to help limit the scope of integration needed.  EEO is then responsible for harmonizing the behavior of all the domains involved in a service—including SDN, NFV, and legacy devices.

Another area where I have some concerns is in the infrastructure and virtualization piece of the architecture.  I couldn’t find an explicit statement that the architecture would support multiple infrastructure managers other than that both virtual and physical infrastructure managers are required.  But does this multiplicity also extend within each category?  If not, then it may be difficult to accommodate multi-vendor solutions given that we already have proprietary management in the physical network device sense, and that the ETSI specs aren’t detailed enough to ensure that a single VIM could manage anyone’s infrastructure.

My management questions continue in the VNF Manager space.  Verizon’s statement is that “Most of the VNF Manager functions are assumed to be generic common functions applicable to any type of VNF. However, the NFV-MANO architectural framework needs to also support cases where VNF instances need specific functionality for their lifecycle management, and such functionality may be specified in the VNF Package.”  This allows an arbitrary split model of VNF management, particularly given that there are no specifications for how “generic functions” are defined or how VNF providers can support them.  It would seem that vendors could easily spin most management into something VNF-specific, which could then complicate integration and interchangeability goals.
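The split is easy to see in a small Python sketch.  What follows is my own illustration of a "generic by default, package can override" VNF Manager; the operation names, the `VnfManager` class, and the idea of measuring a generic share are assumptions of mine and are not defined in the Verizon paper or the ETSI specifications.

```python
from typing import Callable, Dict, Optional

LifecycleHandler = Callable[[str], str]

# Generic lifecycle functions assumed to apply to any type of VNF.
GENERIC_HANDLERS: Dict[str, LifecycleHandler] = {
    "instantiate": lambda vnf: f"generic instantiate {vnf}",
    "scale": lambda vnf: f"generic scale {vnf}",
    "heal": lambda vnf: f"generic heal {vnf}",
    "terminate": lambda vnf: f"generic terminate {vnf}",
}


class VnfManager:
    """Hypothetical VNF Manager that lets a VNF Package override generic functions."""

    def __init__(self, vnf_name: str,
                 package_handlers: Optional[Dict[str, LifecycleHandler]] = None) -> None:
        self.vnf_name = vnf_name
        self.package_handlers = dict(package_handlers or {})

    def run(self, operation: str) -> str:
        # VNF-specific functionality from the package wins over the generic function.
        if operation in self.package_handlers:
            return self.package_handlers[operation](self.vnf_name)
        return GENERIC_HANDLERS[operation](self.vnf_name)

    def generic_share(self) -> float:
        """Fraction of the generic lifecycle operations this VNF leaves untouched."""
        untouched = [op for op in GENERIC_HANDLERS if op not in self.package_handlers]
        return len(untouched) / len(GENERIC_HANDLERS)


plain = VnfManager("nat")
vendor = VnfManager("firewall", {
    "scale": lambda vnf: f"vendor-specific scale {vnf}",
    "heal": lambda vnf: f"vendor-specific heal {vnf}",
})
```

Nothing in a structure like this stops a package from overriding every operation, at which point the "generic" manager is generic in name only.  That is the integration and interchangeability risk in a nutshell.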

EEO is the critical element of the architecture overall.  According to the document, “The objective of EEO is to realize zero-touch provisioning: a service instantiation request — from the operations crew or from the customer, through a self-service portal – results in an automatically executed work flow that triggers VNF instantiation, connectivity establishment and service activation.”  This appears to define a model where functions are assembled to create a service, and then lifecycle management for each function is expected to keep the service in an operating state.  However, Service Assurance interfaces with EEO to respond to SLA failures, which seems to create the potential for multi-level responses to problems that would then have to be organized through fault correlation or response analysis.  All of that could be handled through policy definition and distribution, which Verizon’s architecture also requires.

The final interesting element in the Verizon paper is its statement on Intent-Based Networking (IBN), which is their way of talking about intent modeling and the ONF initiatives in that direction.  The paper makes it clear that Verizon sees IBN as a future approach to “populized” network control rather than as a specific principle of the current architecture, but on the other hand their models (already referenced above) seem to apply intent-based principles throughout their architecture.  It may be that Verizon is using the term “IBN” to refer only to the evolving ONF initiatives, and that it expects to use intent principles in all its model layers.

The most important thing that comes out of the Verizon document, in my view, is that neither the current ONF nor NFV ISG work is sufficient to define an architecture for the deployment of even SDN and NFV (respectively) much less for the deployment of a service.  Integration testing, product assessment, and even SDN/NFV transformation planning need to look a lot further afield to be useful, and that’s going to involve in many cases making up rules rather than identifying standards.  This means, IMHO, that Verizon is not only willing but determined to move beyond the current processes and make their own way.

For players like Ericsson, this could be good news because if every operator follows the Verizon lead and defines their own next-gen architecture, there will be considerable integration work created.  That might diminish in the long term if standards bodies and open-source initiatives start to harmonize the implementation of SDN and NFV and incorporate the required higher-level concepts.

The six vendors I’ve identified as being capable of supporting a complete NFV business case could also learn something from Verizon’s paper.  One clear lesson is that a failure to develop a total benefit picture in positioning, which I think all six vendors have been guilty of, has already exacted a price.  I don’t think operators, including Verizon, would have gone so far in self-integration if they’d had a satisfactory strategy offered in productized form.  However, all six of the key NFV vendors can make the Verizon model work.  Who makes it work best?  I think Nokia wins that one.  I know one of the key Nokia contributors to the Verizon paper, and he’s both an IMS and federation expert.  And, no matter what Verizon may feel about vCPE, it is clear to me that their broad deployment of NFV will start with mobile/5G.

Overall, Verizon’s paper proves a point I discussed yesterday—vendors who want to succeed with NFV will need to have a strong service-based story that resonates with each operator’s market, and a broad architecture that covers all the bases from operations to legacy infrastructure.  Verizon has clearly taken some steps to open up the field to include many different vendors, but most operators will have to rely on a single vendor to at least get the process started, and everyone is going to want to be that vendor.  The bar has been high for that position from the first, and it’s getting higher every day.

What Buyers Think about NFV and the Cloud

I got back from a holiday to a flood of data from both enterprises and network operators/service providers—the former group focusing on cloud and network service views, and the latter group focusing on NFV.  Because all the data is so interesting I thought it was important to get it into a blog ASAP, so here we go!

Let’s start with my last group, the operators/providers.  The issue I was working on was the expectations for NFV in 2016, now that we’re a third of the way through the year.  I got responses from key people in the CIO, CFO, and CTO areas, and I had some surprises, perhaps even a few big ones.

The biggest surprise was that all three groups said they were more sure now that they would deploy some NFV infrastructure in 2016 than they had been at the start of the year.  Nearly five out of six operators responded that way, which is a pretty optimistic view of NFV progress.  What was particularly interesting was that the three key groups all responded about the same way, a sharp departure from last year when CFOs weren’t convinced there was any future in NFV at all.

The second-biggest surprise, perhaps related to the first, was that the amount of NFV infrastructure expected to be deployed was almost universally minimal.  None of the operators said they believed that NFV spending would reach 3% of capital spending even in 2017.  This suggests that operators weren’t rushing to NFV, but perhaps waving earnestly in its direction.

The reason for the limited commitment expected even in 2017 is complicated, and you could approach it in two ways—what makes up the commitment and what has impacted the planning.  Let’s take the first of these first.

There are two paths of NFV that are considered viable by all the key operator constituencies—NFV targeting business customers with premises-hosted virtual CPE, and NFV targeting mobile infrastructure.  The first of these opportunities is not seen as a driver of large-scale systemic NFV at all, but rather a way of addressing the needs of businesses better, particularly through accelerated order cycles, portals for service orders and changes, etc.  The second is seen as potentially ground-shaking in terms of impact, but NFV in mobile infrastructure is very complicated and risky, in operators’ eyes, and thus they expect to dabble a bit before making a major investment.

Add these together and you can see what operators are seeing in 2016 and 2017.  vCPE is going to happen, but nobody thinks that it’s going to shake their spending plans except perhaps MSPs who lease actual service from another operator and supplement it with CPE and management features.  Mobile applications could be very big indeed, but that very bigness means it’s going to happen at a carefully considered pace.

If all the “Cs” in the operator pantheon are in accord on these basic points, they still differ on the next one, which is the factors that got them to where they are.  The CIO and CFO organizations feel that a business case has been made for vCPE—enough said.  They also feel that there’s enough meat in mobile NFV to justify real exploration of the potential, including field trials.  But they do not believe that a broad NFV case has been made, and in fact these two groups believe that no such broad case for NFV can be made based on current technology.  The CTO, perhaps not surprisingly, thinks that NFV’s broad impact is clear and that it’s just a matter of realizing its potential.

Everyone is back in sync when it comes to what might realize NFV potential—open-source software.  In fact, the CIO and CFO are starting to think that their transformation is more generally about open-source use than about NFV in particular.  This, perhaps, is due to the fact that the CTO organizations have shifted their focus to open-source projects aiming at NFV software, from the standards process that’s still the titular basis for NFV.  Getting all the pieces of NFV turns out to involve getting a lot of related open-source stuff that can be applied to NFV, but also to other things.

Everyone has their own example of this.  The CIOs are starting to see the possibility of having OSS/BSS transformation based more on open-source tools—or at least those who are interested in OSS/BSS transformation.  It’s clear from the responses I got that CIOs are split on just how much the fundamentals of their software needs to change.  A slight majority still see their job as being simply the accommodating of SDN/NFV changes, but it seems more and more CIOs think a major change in operations software is needed.

There’s also more harmony of views than I’d expected on just how far NFV will go in the long run and what will drive it there.  Only about a quarter of CIO/CFO/CTO responses suggest that systemic infrastructure change will drive broad adoption of NFV.  The remainder say that it will be service-specific, and the service that has the most chance of driving broad deployment is mobile.  Those operators with large mobile service bases think they’ll be adopting NFV for many mobile missions, and this will then position more hosting infrastructure for other services to exploit.  Those without mobile infrastructure see NFV mostly as a business service agility tool, as I’ve said.

My impression from this is that operators have accepted a more limited mission for NFV, one that’s driven by specific service opportunities, over the notion of broad infrastructure transformation.  That doesn’t mean that we wouldn’t get to the optimum deployment of NFV, but it does suggest that getting there will be a matter of a longer cycle of evolution rather than an aggressive deployment decision.  The fact that the CIOs are not united on a course for OSS/BSS transformation seems the largest factor in this; you need those opex benefits to justify aggressive NFV roll-out.

Services that prove a broad benefit case—that’s the bottom line for NFV.

On the cloud side, enterprises’ views are related to media coverage of cloud adoption, but only just related.  Nearly all enterprises and about a third of mid-sized businesses report they use cloud services directly (not just hosting, but real cloud deployment of their apps).  Only one enterprise, a smaller one, said they had committed more than 15% of IT spending to the cloud, and none of the enterprises expected to shift more than 30%.  The cloud will be pervasive, but not decisive.

The problem seems to be that enterprises still visualize the sole benefit of cloud computing to be lower costs, in some sense at least, and that the cloud will host what they already run.  Given this picture, it’s not surprising that both Microsoft and IBM reported that while cloud sales have grown for them, that growth hasn’t replaced losses on the traditional IT side.  Users who adopt the cloud because it’s cheaper will always spend less.  The only way to get out of that box is to unlock new benefits with new cloud-specific techniques, but users have almost zero understanding of that potential.  Part of that is due to lack of vendor support for new productivity-benefit-driven cloud missions, and part to the fact that current enterprise IT management has grown up in a cost-driven planning age and can’t make the transition to productivity benefits easily.

There’s one common thread between the operator NFV and enterprise cloud stories, and that’s over-expectation.  Well over 90% of both groups say that they’ve had to work hard to counter expectations and perspectives that just didn’t jibe with the real technology the market was presenting or the realistic benefits that technology could generate.  We live in an ad-sponsored age, where virtually everything you read is there because some seller has paid for it in some way.  That’s obviously going to promote over-positioning, and while the results aren’t fatal for either technology, I think it’s clear that NFV and the cloud would have progressed (and would continue to progress) further and faster if buyers could get a realistic view of what can be expected.

The Best Approach to SDN and NFV isn’t from ETSI or Open-Something, but From the MEF

I had a very interesting talk with the MEF and with their new CTO (Pascal Menezes), covering their “Third Network”, “Lifecycle Service Orchestration” and other things.  If you’ve read my stuff before, you know that there are many aspects of their strategy that I think are insightful, even compelling.  I’m even more sure about that after my call with them, and also more certain that they intend to exploit their approach fully.  I’m hoping that they can.

The Third Network notion comes because we have two networks today—the Internet which is an everybody-to-everybody open, best-efforts, fabric that mingles everything good and bad about technology or even society, and business connectivity which is more trustworthy, has an SLA, and supports an explicit and presumably trusted community.  One network will let me reach anything, do anything, in grand disorder. The other orders things and with that order comes a massive inertia that limits what I can do.

We live in a world where consumerism dominates more and more of technology, and where consumer marketing dominates even businesses.  An opportunity arises in moments, and is lost perhaps in days.  In the MEF vision (though not explicitly in their positioning) the over-the-top players (OTTs) have succeeded and threatened operators because the OTTs could use the everything-everywhere structure of the Internet to deliver service responses to new opportunities before operators could even schedule a meeting of all their stakeholders.

The operators can hardly abandon their networks, for all the obvious reasons.  They need to somehow adapt their network processes to something closer to market speed.  I think that the MEF concept of the Third Network reflects that goal nicely in a positioning sense.

At a technical level, it could look even better.  Suppose we take MEF slides at the high level as the model—the Third Network is an interconnection of provider facilities at Level 2 that creates a global fabric that rivals the Internet in reach without its connectivity promiscuity and its QoS disorder.  If you were to build services on the Third Network you could in theory have any arbitrary balance of Internet-like and Carrier-Ethernet-like properties and costs.  You could establish Network-as-a-Service (NaaS) in a meaningful sense.

In my view the obvious, logical, simple, and compelling architecture is to use the Third Network as the foundation for a set of overlay networks.  Call them Nicira-SDN-like, or call them tunnels, or virtual wires, or even SD-WANs.  Tunnels would create a service framework independent of the underlayment, which is important because we know that L2 connectivity isn’t agile or scalable on the level of the Internet.  The point is that these networks would use hosted nodal functionality combined with overlay tunnels to create any number of arbitrary connection networks on top of the Ethernet underlayment.  This model isn’t explicit in the MEF slides but their CTO says it’s their long-term goal.

A combination of overlay/underlay and an interconnected-metro model of the network of the future would be in my view incredibly insightful, and if it could be promoted effectively, it could be a revolution.  The MEF is the only body that seems to be articulating this model, and that makes them a player in next-gen infrastructure in itself.

What’s needed to make this happen?  The answer is two things, and two things only.  One is a public interconnection of Level 2 networks to create the underlayment.  The other is a place to host the nodal features needed to link the tunnels into virtual services.  We can host features at the user edge if needed, and we know how to do network-to-network interfaces (NNIs).  The operators could field both these things if they liked, but so could (and do, by the way) third parties like MSPs.

What would make this notion more valuable?  The answer is “the ability to provide distributed hosting for nodal functionality and other features”.  Thus, philosophically above our Third Network connection fabric would be a tightly coupled cloud fabric in which we could deploy whatever is needed to link tunnels into services and whatever features might be useful to supplement the basic connectivity models we can provide that way.  These, classically, are “LINE”, “LAN”, and “TREE”, which the MEF recognizes explicitly, as well as ACCESS and NNI.

If the Third Network is going to provide QoS, then it needs to support classes of service in the L2 underlayment, and be able to route tunnels for services onto the proper CoS.  If it’s going to provide security then it has to be sure that tunnels don’t interfere or cross-connect with each other, and that a node that establishes/connects tunnels doesn’t get hacked or doesn’t create interfering requests for resources.  All of that is well within the state of the art.  It also has to be able to support the deployment of nodes that can concentrate tunnel traffic internally to the network for efficiency, and also to host features beyond tunnel cross-connect if they’re useful.
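
As a rough illustration of those two requirements, the sketch below maps each tunnel onto the cheapest class of service that still meets its latency need, and refuses to cross-connect tunnels that belong to different tenants.  The class names, latency bounds, and the `Tunnel` structure are assumptions of mine for the example; none of them come from MEF material.

```python
from dataclasses import dataclass

# Hypothetical underlay classes, cheapest first, with the latency bound (ms) each guarantees.
COS_CHEAPEST_FIRST = [
    ("best-effort", float("inf")),
    ("business", 50.0),
    ("premium", 10.0),
]


@dataclass(frozen=True)
class Tunnel:
    tunnel_id: str
    tenant: str
    max_latency_ms: float  # worst latency the service can tolerate


def select_cos(tunnel: Tunnel) -> str:
    """Pick the cheapest class whose guaranteed bound satisfies the tunnel."""
    for name, bound_ms in COS_CHEAPEST_FIRST:
        if bound_ms <= tunnel.max_latency_ms:
            return name
    raise ValueError(f"no class of service meets {tunnel.max_latency_ms} ms")


def cross_connect(a: Tunnel, b: Tunnel) -> tuple:
    """Join two tunnels at a node, but only within a single tenant."""
    if a.tenant != b.tenant:
        raise PermissionError("tunnels from different tenants may not be joined")
    return (a.tunnel_id, b.tunnel_id)
```

A real node would also have to police the resource requests it makes on behalf of tunnels, which this sketch leaves out.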

You don’t need either SDN or NFV for this.  You can build this kind of structure today with today’s technology, probably at little incremental cost.  That to my view is the beauty of the Third Network.  If, over time, the needs of all those tunnels whizzing around and all that functionality hunkering down on hosting points can be met better with SDN or NFV, or cheaper with them, or both—then you justify an evolution.

What you do need in the near term is a means of orchestrating and managing the new overlay services.  Lifecycle Service Orchestration (LSO) is the MEF lifecycle process manager, but here I think they may have sunk themselves too far into the details.  Yes it is true that tunnels will have to be supported over legacy infrastructure (L2 in various forms, IP/MPLS in various forms), SDN, and NFV.  However, that should be only the bottom layer.  You need a mechanism for service-level orchestration because you’ve just created a service overlay independent of the real network.

The details of LSO are hard to pull from a slide deck, but it appears that it’s expected to act as a kind of overmind to the lower-level management and orchestration processes of NMS, SDN, and NFV.  If we presumed that there was a formal specification for the tunnel-management nodes that could be resident in the network (hosted in the cloud fabric) or distributed to endpoints (vCPE) then we could say this is a reasonable presentation of features.  The slides don’t show that, and in fact don’t show the specific elements for an overlay network—those tunnel-management nodes.

It all comes down to this, in my view.  If the MEF’s Third Network vision is that of an overlay network on top of a new global L2 infrastructure, then they need tunnel-management nodes and they need to orchestrate them at least as much as the stuff below (again, they assure me that this is coming).  You could simply let CoS do what’s needed, if you wanted minimalist work.  If they don’t define those tunnel-management nodes and don’t orchestrate them with LSO, then I think the whole Third Network thing starts to look like slideware.

The Third Network’s real state has special relevance in the seemingly endless battle over the business case for network evolution.  In my own view, the Third Network is a way of getting operators close to the model of future services that they need, without major fork-lift modernization or undue risk.  It could even be somewhat modular in terms of application to services and geographies.  Finally, it would potentially not only accommodate SDN and NFV but facilitate them—should it succeed.  If the Third Network fails, or succeeds only as a limited interconnect model, then operators will inevitably have to do something in the near term, and what they do might not lead as easily to SDN and NFV adoption.

This could be big, but as I’ve noted already the model isn’t really supported in detail by the MEF slideware, and in fact I had to have an email exchange with the CTO to get clarifications (particularly on the overlay model and overlay orchestration) to satisfy my requirement for written validation of claims.  He was happy to do that, and I think the MEF’s direction here is clear, but the current details are sparse because the long term is still a work in progress.

The MEF is working to re-invent itself, to find a mission for L2 and metro in an age that seems obsessed with virtualizing and orchestrating.  Forums compete just like vendors do, after all, and the results of some of this competition are fairly cynical.  I think that the MEF has responded to the media power of SDN and NFV, for example, by featuring those technologies in its Third Network, when the power of that approach is that it doesn’t rely on either, but could exploit both.  Their risk now lies in posturing too much and addressing too little, of slowing development of their critical and insightful overlay/underlay value proposition to blow kisses at technologies that are getting better ink.  There’s no time for that.

Whether the foundation of the Third Network was forum-competition opportunism or market-opportunity-realization is something we may never know, but frankly it would matter only if the outcome was questionable.  I’m more convinced than ever that the MEF is really on to something with the Third Network.  I hope they take it along the path they’ve indicated.

How to Get NFV On Track

You can certainly tell from the media coverage that progress on NFV isn’t living up to press expectations.  That’s not surprising on two fronts: first, press expectations border on an instant gratification fetish that nothing could live up to, and second, transformation of a three-trillion-dollar industry with average capital cycles of almost six years won’t happen overnight.  The interesting thing is that many operators were just as surprised as the press has been at the slow progress.  Knowing more about their perceptions might be a great step to getting NFV going, so let’s look at the views and the issues behind them.

In my recent exchanges with network operator CFO organizations, I found that almost 90% said that NFV was progressing more slowly than they had hoped.  That means that senior management in the operator space had really been committed to the idea that NFV could solve their declining profit-per-bit problems before the critical 2017 point when the figure falls low enough to compromise further investment.  They’re now concerned it won’t meet their goals.

Second point:  The same CFO organizations said that their perception was that NFV progress was slower now than in the first year NFV was launched (2013).  That means that senior management doesn’t think that NFV is moving as fast as it was, which means that as an activity it’s not closing the gap to achieving its goals.

Third point:  Even in the organizations that have been responsible for NFV development and testing, nearly three out of four say that progress has slowed and that they are less confident that “significant progress” is being made on building a broad benefit case.

Final point: Operators are now betting more on open-source software and operator-driven projects than on standards and products from vendors.  Those CFO organizations said that they did not believe they would deploy NFV based on a vendor’s approach, but would instead be deploying a completely open solution.  How many?  One hundred percent.  The number was almost the same for the technologists who had driven the process.  Operators have a new horse to ride.

I’m obviously reporting a negative here, which many vendors (and some of my clients) won’t like.  Some people who read my blog have contacted me to ask why I’m “against” NFV, which I find ironic because I’ve been working to make it succeed for longer than the ETSI ISG even existed.  Further, I’ve always said (and I’ll say again here and now) that I firmly believe that a broad business case can be made for NFV deployment.  I’ve even named six vendors who can make it with their own product sets.  But you can’t fix a problem you refuse to acknowledge.  I want to fix it and so I want to acknowledge it.

The first problem was that the ETSI ISG process was an accommodation to regulatory barriers to operators working with each other to develop stuff.  I’ve run into this before; in one case operator legal departments literally said they’d close down an activity because it would be viewed as regulatory collusion as it was being run.  The collusion issue was fixed by absorption into another body (dominated by vendors) but the body never recovered its relevance.  That also happened with NFV, run inside ETSI and eventually dominated by vendors.

The second problem was that standards in a traditional sense are a poor way to define what has to be a software structure.  Software design principles are well-established; every business lives or dies on successful software after all.  These principles have to be applied by a software design process, populated by software architects.  That didn’t happen, and so we have what’s increasingly a detailed software design created indirectly and without any regard for what makes software open, agile, efficient, or even workable.

The third problem was that you can’t boil the ocean, and so early NFV work focused on two small issues—did the specific notion of deploying VNFs to create services work at the technical level, and could that be proved for a “case study”.  Technical viability should never have been questioned at all because we already had proof from commercial public cloud computing that it did work.  Case studies are helpful, but only if they represent a microcosm of the broad targets and goal sets involved in the business case.  There was never an attempt to define that broad business case, and so the case studies turned into proofs of concept that were totally service-specific.  No single service can drive infrastructure change on a broad scale.

All of this is what’s generated the seemingly ever-expanding number of “open” or “open-source” initiatives.  We have OPNFV, ONOS, OSM, OPEN-O, and operator initiatives like ECOMP from AT&T.  In addition, nearly all the vendors who have NFV solutions say their stuff is at least open, and some say it’s open-source.  The common thread here is that operators are demanding effective implementations, have lost faith that vendors will generate them on their own, and so are working through open-source to do what their legal departments wouldn’t let them do in a standards initiative.

The open-source approach is what should have been done from the first, because in theory it can be driven by software architecture and built to address the requirements first, in a top-down way.  However, software design doesn’t always proceed as it should, and so even this latest initiative could fail to deliver what’s needed.  What’s necessary to make that happen?  That’s our current question.

The goal now, for the operators and for vendors who want NFV to succeed, is to create an open model for NFV implementation and back that model with open-source implementations.  That model has to have these two specific elements:

  1. There must be an intent-model interface that identifies the relationship between the NFV MANO process and OSS/BSS/NMS, and another that defines the “Infrastructure Manager” relationship to MANO.
  2. There must be a Platform-as-a-Service (PaaS) API set that defines the “sandbox” in which all Virtual Network Functions (VNFs) run, and that provides linkage between VNFs and the rest of the NFV software.

There are three elements to NFV.  One is the infrastructure on which stuff is deployed and connected, and this is represented by an infrastructure manager (IM, in my terms, VIM for “virtual” infrastructure manager in the ETSI ISG specs).  One is the management and orchestration component itself, MANO, and one is the VNFs.  The goal is to standardize the functionality of these three things and to control the way they connect among themselves and to the outside.  This is critical in reducing integration issues and providing for open, multi-vendor, implementations.
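
Here is a minimal sketch of what an intent-model boundary between MANO and an infrastructure manager might look like in code.  The point is only that the caller states what it wants (a function plus SLA terms) and checks the outcome, never how it is deployed.  The class and method names (`Intent`, `realize`, `status`, `meets_sla`) are hypothetical and are not drawn from the ETSI specs.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Intent:
    """What is wanted, not how: a function name plus minimum SLA terms."""
    function: str
    sla: Dict[str, float]  # e.g. {"availability": 0.999}


class InfrastructureManager(ABC):
    """Hypothetical IM interface; implementations hide all deployment detail."""

    @abstractmethod
    def realize(self, intent: Intent) -> str:
        """Deploy something that satisfies the intent and return an opaque handle."""

    @abstractmethod
    def status(self, handle: str) -> Dict[str, float]:
        """Report observed values in the same terms the SLA uses."""


class StubInfrastructureManager(InfrastructureManager):
    """Stand-in implementation so the sketch runs; reports a fixed availability."""

    def __init__(self) -> None:
        self._deployed: Dict[str, Dict[str, float]] = {}

    def realize(self, intent: Intent) -> str:
        handle = f"{intent.function}-{len(self._deployed) + 1}"
        self._deployed[handle] = {"availability": 0.9995}
        return handle

    def status(self, handle: str) -> Dict[str, float]:
        return dict(self._deployed[handle])


def meets_sla(im: InfrastructureManager, intent: Intent, handle: str) -> bool:
    """MANO-side check that uses only the intent terms, never IM internals."""
    observed = im.status(handle)
    return all(observed.get(term, 0.0) >= floor for term, floor in intent.sla.items())
```

Any implementation that honors the same two calls could be swapped in behind MANO, which is the integration benefit being argued for.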

We can’t simply collect the ETSI material into a set of specs to define my three elements and their connections; the details don’t exist in ETSI material.  This puts anything that’s firmly bound to the ETSI model at risk of being incomplete.  While an open-source implementation could expose and fix the problems, it’s not totally clear that any do (ONOS, CORD, and XOS among the open groups, or ECOMP for operators, seem most likely to be able to do what’s needed).

Vendors have to get behind this process too.  They can do so by accepting the componentization I’ve noted, and by supporting the intent models and PaaS, by simply aligning their own implementations that way.  Yes, it might end up being a pre-standards approach, but the kind of API-and-model structure I’ve noted can be transformed to a different API format without enormous difficulty—it’s done in software so often that there’s a process called an “Adapter Design Pattern” (and some slightly different but related ones too) to describe how it works.  The vendors, then, could adapt to conform to the standards that emerged from the open-source effort.  They could also still innovate in their own model if they wanted, providing they could prove the benefit and providing they still offered a standard approach.
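
For readers who haven’t run into the Adapter pattern, this is roughly what it looks like.  Both APIs here are invented for illustration: `StandardDeployer` stands in for whatever standard emerges from the open-source effort, and `VendorOrchestrator` stands in for a vendor’s pre-standard interface.

```python
from typing import Dict, Protocol


class StandardDeployer(Protocol):
    """Hypothetical standard API that callers are written against."""

    def deploy(self, function_name: str, location: str) -> str: ...


class VendorOrchestrator:
    """Hypothetical pre-standard vendor API with its own call shape."""

    def instantiate_vnf(self, request: Dict[str, str]) -> Dict[str, str]:
        return {"status": "ACTIVE", "id": f"{request['vnf']}@{request['site']}"}


class VendorAdapter:
    """Adapter: presents the standard API and delegates to the vendor API."""

    def __init__(self, vendor: VendorOrchestrator) -> None:
        self._vendor = vendor

    def deploy(self, function_name: str, location: str) -> str:
        # Translate the standard call into the vendor's request format.
        result = self._vendor.instantiate_vnf({"vnf": function_name, "site": location})
        if result["status"] != "ACTIVE":
            raise RuntimeError(f"deployment failed: {result}")
        return result["id"]


def provision(deployer: StandardDeployer) -> str:
    """Caller code that knows only the standard API."""
    return deployer.deploy("firewall", "metro-east")
```

The vendor’s implementation is untouched; only the thin adapter changes if the standard API changes, which is why a pre-standards approach carries limited risk.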

This open approach isn’t essential in making the business case for NFV.  In some respects, it’s an impediment because it will take time for any consensus process to work out an overall architecture that fits (in general) my proposed model.  A single-vendor strategy could do that right now—six of them, in fact.  The problem is that vendors have lost the initiative now, and even if they got smart in their positioning it’s not clear that they could present a proprietary strategy that had compelling benefits.  They need an open model, a provable one.  That’s something that even those six might struggle a bit with; I don’t have enough detail on about half of the six to say for sure that they could open theirs up in a satisfactory way.  All of them will need some VNF PaaS tuning.

I think that it is totally within the capabilities of the various open-source organizations to solve the architecture-model problem and generate relevant specs and APIs, as well as reference implementations.  It is similarly well within vendor capabilities to adopt a general architecture to promote openness—like the one I’ve described here—and to promise to conform to specific standards and APIs as they are defined.  None of this would take very long, and if it were done by the end of the summer (totally feasible IMHO) then we’d remove nearly all the technical barriers to NFV deployment.  Since I think applying the new structure to the business side would also be easy, we’d quickly be able to prove a business case.

Which is why I think this impasse is so totally stupid.  How does this benefit anyone, other than perhaps a few vendors who believe that even if operators end up losing money on every service bit they carry they’ll sustain their spending or even grow it?  A small team of dedicated people could do everything needed here, and we have thousands in the industry supposedly working on it.  That makes no sense if people really want the problem solved.

My purpose here is to tell the truth as I see it, which is that we are threatening a very powerful and useful technology with extinction with no reason other than stubborn refusal to face reality.  NFV can work, and work well, if we’re determined to make that happen.



Is Ericsson’s NodePrime Deal Even Smarter Than it Looks?

Ericsson has made some pretty smart moves in the past, long before their smartness was obvious to the market.  They may have made another one with their acquisition of NodePrime, a hyperscale data center management company that could be a stepping stone for Ericsson to supremacy in a number of cloud-related markets, including of course IoT.

It seems the theory behind the deal is clear; if IoT or NFV or any other cloud-driven technology is going to succeed on a large scale, then data centers are going to explode.  Thus, dealing with that explosion would be critical in itself, but to make matters worse (or better, from Ericsson’s view) just getting to large-scale success will certainly require enormous efficiency in operationalizing the data center resources as they grow.  Hence, NodePrime.

Data centers don’t lack sources of operations statistics; there are a couple dozen primary statistics points in any given operating system and at least a half-dozen in middleware packages.  In total, one operator found they had 35 sources of data to be analyzed per server and 29 per virtual machine, and then of course there’s the network.  The basic NodePrime model is to acquire, timestamp, and organize all of this into a database that can then be used for problem and systems management and lifecycle management for installed applications.

Hyperscale data centers aren’t necessarily the result of SDN, NFV, or IoT.  While NodePrime positioning calls out that target, they also make it clear that they can manage data centers of any size, which means that they could probably manage both the individual distributed data centers operators are likely to deploy in SDN/NFV/IoT applications and the collective virtual data center (that’s a layer in NodePrime, in fact).  The NodePrime model also has three functional dimensions.  You can manage data centers this way.  You can manage service infrastructure this way, and feed the results into something like NFV management, and you could even build an IoT service by collecting sensor data like you collect server statistics.  I’m told that some operators have already looked at all three of these functional dimensions, and that NodePrime has said they support them all.

If we presumed that the management of an application or service was based on direct analysis of the management data from every resource committed in its support, then any complicated service could have insurmountable management problems.  If we presumed that a smartphone had to query a bunch of traffic sensors directly and analyze the trends and movements to figure a route, the problems would be similarly insurmountable.  The fact is that any rational application based on using information has to be designed around an information analysis framework.

A framework has to do three things.  First, it has to gather information from many different sources using many different interfaces.  Second, it has to harmonize the data into a common model and timestamp and contextualize the information, and finally it has to support all the standard analytics and query tools to provide data views.  NodePrime does all of this.
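As a rough illustration of those three steps, here is a minimal Python sketch.  It is not NodePrime's API; the `Reading` record, the two adapter functions, the `DataHub` class, and the metric names are all hypothetical, invented only to show gathering, harmonizing/timestamping, and querying in one place.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Reading:
    """Common record every source is harmonized into (hypothetical model)."""
    source: str       # e.g. "os" or "middleware"
    resource: str     # the server, VM, or sensor the reading is about
    metric: str
    value: float
    timestamp: float  # seconds, stamped when the reading was collected


# Step 1: adapters hide the different native interfaces of each source.
def from_os_stats(host: str, raw: dict, now: float) -> Iterable[Reading]:
    # Assumed native format: {"cpu_pct": 71.0, "mem_pct": 40.0}
    for metric, value in raw.items():
        yield Reading("os", host, metric, float(value), now)


def from_middleware(host: str, raw: list, now: float) -> Iterable[Reading]:
    # Assumed native format: [("queue_depth", "12"), ...] with string values
    for metric, value in raw:
        yield Reading("middleware", host, metric, float(value), now)


class DataHub:
    """Steps 2 and 3: hold harmonized readings and answer simple queries."""

    def __init__(self) -> None:
        self._readings: List[Reading] = []

    def ingest(self, readings: Iterable[Reading]) -> None:
        self._readings.extend(readings)

    def query(self, predicate: Callable[[Reading], bool]) -> List[Reading]:
        return [r for r in self._readings if predicate(r)]

    def latest(self, resource: str, metric: str) -> Optional[Reading]:
        matches = self.query(
            lambda r: r.resource == resource and r.metric == metric
        )
        return max(matches, key=lambda r: r.timestamp, default=None)


hub = DataHub()
hub.ingest(from_os_stats("server-1", {"cpu_pct": 71.0}, now=100.0))
hub.ingest(from_os_stats("server-1", {"cpu_pct": 93.5}, now=160.0))
hub.ingest(from_middleware("server-1", [("queue_depth", "12")], now=160.0))

# A "data view": every CPU reading above 90 percent, whatever its source.
hot = hub.query(lambda r: r.metric == "cpu_pct" and r.value > 90)
```

The point of the common record is that a query never needs to know which interface a reading came from, which is what lets the same store feed data center management, service management, or an IoT application.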

The NodePrime model could represent a management abstraction very easily (datahub, in NodePrime).  The resources at the bottom are collected and correlated, passed into an analytics layer in the middle (directive) and used to create an abstraction of the resource framework that’s useful (virtual datacenter in the current model, but why not “virtual infrastructure”?).  This abstraction could then be mapped to service management, VNF management, and so forth.

It also works for IoT and contextual services.  Collect basic data at the bottom, use queries to generate abstractions that are helpful to users/applications, then expose these abstractions through microservices at the top.  NodePrime supports this too.

Well, sort of does it.  The meat of the value of NodePrime will come from the variety of information resources it can link and the range of analytics it can support.  For SDN, NFV, IoT, and other cloud applications and services, a lot of this is going to be developed by an integrator—like Ericsson.  Ericsson can enrich the capabilities of NodePrime through custom development and specialized professional services, which of course is what it likely wanted all along.

This isn’t the first time that a vendor has come at the notion of a network revolution driven by data centers and not networks.  Brocade had this message as the foundation of their NFV position in 2013 and gained a lot of traction with operators as a result.  They didn’t carry through with enough substance and they gradually lost influence.  Brocade has recently been making some acquisitions of its own, and one in the DevOps space that could arguably be an orthogonal shot at the data center space, because it’s targeting deployment and lifecycle management.

An inside-out vision of network evolution is logical, then, but it’s also a climb.  The further you are from the primary benefit case with a given technology, the longer it takes for you to build sales messaging that carries you through.  That’s been the problem with SDN and NFV, both of which in simplistic terms postulate a completely new infrastructure that would be cheaper to run and more agile.  How do you prove that without making a massive unsupported bet?

That’s where an Ericsson initiative to connect NodePrime decisively with IoT could be extravagantly valuable.  Industrial IoT isn’t really IoT at all, it’s simply an evolution of the established process control and M2M trends of the recent past.  However, the model that’s optimal for industrial IoT happens to be the only model that’s likely to be viable for “broad IoT”, and also a useful model for evolving services toward hosted components.  Ericsson could have a powerful impact with NodePrime.

The question with something like this is always “but will they?” of course.  There’s enough value in hyperscale or even normal data center management for cloud providers and operators to justify the buy without any specific mission for NFV, SDN, or IoT.  NodePrime was part of Ericsson’s Hyperscale Datacenter System 8000 before the deal was made.  However, the press release focuses on what Ericsson calls “software-defined infrastructure” in a way that seems to lead directly to NFV.

It’s not clear that Ericsson sees NodePrime’s potentially crucial role in IoT, or how success there might actually drive success with “software-defined infrastructure” by short-cutting the path to benefits and a business case.  NodePrime had some industrial IoT engagements before the Ericsson deal and Ericsson is surely aware of them, but there was no specific mention of IoT in the press release.  I had some comments from operators on the deal that suggested Ericsson/NodePrime had raised the vision with them at the sales level, but it’s not clear whether that was a feeler or the start of an initiative.

The qualifier “industrial IoT” used by NodePrime and some publications covering the story may simply reflect the fact that “industrial IoT” uses internal sensor networks and IoT fanatics aren’t prepared to let go of the notion of promiscuous Internet sensors quite yet.  We’ll have to see how this evolves.


A Service Plan for vCPE

The sweet spot for NFV so far has been virtual CPE (vCPE) and the sweet spot for vCPE has been managed services.  Nearly every operator out there has managed services ambitions at some level, but at least three out of four admit that they’re planning in a more hopeful-than-helpful sense.  Is there a right, or best, way to address the opportunity?  Yeah, there is, as you’ve no doubt guessed, and I’ve tried to assemble the guidance here.

To start with, managed service success is rooted in what could be called geo-demographics.  Prospects naturally sell themselves if they’re associated in a geographic sense, sales are easier if you have rich territories, and infrastructure and support are most efficient where customers can be concentrated rather than scattered over a wide area.  Most countries offer both consumer and business census information by locality, meaning zipcode or metro area at the least.  You start there for the geo part.

For demography, you have to look at the various MSP value propositions, which means looking at the service customer you want to turn into an MSP customer.  To start with, that’s an important point in itself because you don’t want to try to sell MSP services to somebody who isn’t connected already.  The selling cycle will be too long.  Ideally your prospect will have a service connection and have issues that can be addressed with a managed service.

Those issues are most likely to be present where there’s little or no network support expertise in-house, or perhaps even in the local labor pool.  If you want to sell a managed service you’re selling management of a service, which means you’re competing with any in-house personnel who happen to provide that already.  Since these very people are likely the ones to be assessing your offering, it can be a tough sell.  Most consumers lack network tech skill, and so do most SMBs.

There are a lot more consumers than SMBs, of course.  In the US there are about 6 million SMB sites and roughly 20 times as many households.  However, business willingness to pay is much higher because the equipment and support needed to sustain business connectivity is much more expensive.  Census data that identifies the business population of a given area (I’ll use zipcode here as my example) is available, and most importantly that information is usually available by industrial classification (SIC, NAICS, etc.) within geography.

The reason you need the industry data is that the consumption of IT and network services varies radically by industry.  Those that consume the most on network services spend about six thousand times those that consume the least, in the US market.  What you’d like to find is an industry that spends a lot on network services and is well-represented by mid-sized businesses in your geography.

Another piece of industry data to look at is the rate of spending on integration services.  The leading industry in this category, in the US, spends about 15% of its IT budget on integration services, while the last of the pack spends less than a half-percent.  If your market geography provides multi-year information you can also look at network spending and integration as a percent of IT spending growth over that period; fast growth usually puts stress on internal support.

When all else fails, look at the ratio of spending on personal computers to minicomputers and servers, meaning central IT.  Where there’s a large central IT structure there’s likely to be more technical support available, while companies with a bunch of distributed PCs often don’t have nearly the level of support the centralized gang do.
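The screening described above can be sketched as a simple ranking.  The Python below is illustrative only: the `Segment` fields, the weights, the scaling constants, and every figure in the sample data are made-up placeholders rather than census or survey data, and a real model would need to be calibrated against actual spending statistics.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One industry within one zipcode (all fields are hypothetical inputs)."""
    zipcode: str
    industry: str
    midsize_sites: int          # count of mid-sized business sites
    network_spend_index: float  # network spend relative to the lowest industry
    integration_pct: float      # percent of IT budget spent on integration
    pc_to_server_ratio: float   # PC spend divided by server/central IT spend


def score(seg: Segment) -> float:
    """Higher means a more attractive managed-service target.

    The weights and caps are arbitrary assumptions chosen for illustration.
    """
    if seg.midsize_sites == 0:
        return 0.0
    spend = min(seg.network_spend_index / 100.0, 1.0)     # cap at 1.0
    integration = min(seg.integration_pct / 15.0, 1.0)    # 15% is the top end
    distributed = min(seg.pc_to_server_ratio / 5.0, 1.0)  # weak central IT
    quality = 0.5 * spend + 0.3 * integration + 0.2 * distributed
    return quality * seg.midsize_sites


def rank(segments):
    """Return segments ordered from most to least attractive."""
    return sorted(segments, key=score, reverse=True)


# Placeholder zipcodes and figures, not real data.
sample = [
    Segment("00001", "healthcare", 40, 80.0, 12.0, 4.0),
    Segment("00001", "retail", 120, 10.0, 1.0, 2.0),
    Segment("00002", "finance", 25, 100.0, 15.0, 1.0),
]

for seg in rank(sample):
    print(f"{seg.zipcode} {seg.industry:<10} score={score(seg):.1f}")
```

Multiplying the per-site quality by the number of mid-sized sites reflects the idea that you want an industry that both spends heavily and is well represented in your geography; a segment that is strong on only one of the two ranks lower.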

NFV vendors and analysts may like the idea of targeting specific virtual functions for your vCPE, but my data suggests that there are three broad sweet spots and that further refinement is difficult and often non-productive.  The best area is security, which includes virus scanning, encryption, and firewall services.  Second-best is facility monitoring, meaning the collection of sensor data and activation of control processes, and last is management of IT exercised through the network connection.

The limited experience available with vCPE marketing so far suggests that the best strategy is to present these three categories all at once rather than to focus on one.  The reason is solid: a single application in a single area isn’t likely to generate broad buy-in on the prospect side, and for such an application a device might be a viable option.  A good MSP salesperson would try to engage on at least two of the three categories, which means that pricing should favor that.  Pick a target with easy adoption in two areas and run with it.

Be very aware of the nature of the managed services you’re offering relative to the connection model of the user.  There are about 7.5 million business sites in the US, about 1.5 million of which are satellite sites of multi-site businesses.  This group is a great source of opportunity for VPN services, but try selling a VPN to a company with one location!  Also be aware that when you sell to a multi-site business, you don’t sell to the branches but to the headquarters, so you need to check the business name online to see where the HQ is located.  If it’s out of your sales target area, forget it.

Once you have your prospects and your target service, think about fulfillment.  Your approach has to balance the cost of premises hosting on an agile appliance (vCPE) versus centralized hosting in one or more data centers.  The vCPE approach has convincing credentials as the best entry strategy because cost scales with customer base, and it’s going to retain that advantage for even fairly large deployments if the customers are widely distributed.  That’s because network connection and support costs for the tie-back of each user to a small number of data centers could quickly eradicate any economy of scale benefits.

Where this might not be true is where you have a very geographically confined customer set.  A small customer geography means that a small number of hosting points would serve customers without much hairpinning of traffic patterns to reach the hosting points.  This points out the infrastructure value of concentration; your sales types will tell you it also generates better reference accounts.  That’s particularly true if you have a limited industry focus; no reference is perceived to be as valuable as a firm of the same type in the same area.
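The tradeoff between premises and centralized hosting can be put into a back-of-the-envelope model.  The sketch below is a deliberate simplification under stated assumptions: the function names and the cost structure (one appliance per customer, a fixed data center cost, a per-customer hosting share, and a per-customer tie-back cost) are hypothetical, and the figures in the usage lines are invented to show the shape of the result, not taken from operator data.

```python
def vcpe_cost(customers: int, device_cost: float) -> float:
    """Premises hosting: one agile appliance per customer, so cost scales
    linearly with the customer base."""
    return customers * device_cost


def central_cost(customers: int, fixed_dc_cost: float,
                 per_customer_hosting: float, tieback_cost: float) -> float:
    """Centralized hosting: a fixed data center cost, plus a per-customer
    hosting share, plus the network tie-back to reach the hosting point."""
    return fixed_dc_cost + customers * (per_customer_hosting + tieback_cost)


def breakeven_customers(device_cost: float, fixed_dc_cost: float,
                        per_customer_hosting: float, tieback_cost: float):
    """Smallest customer count at which central hosting becomes cheaper,
    or None if tie-back costs wipe out the economy of scale entirely."""
    saving_per_customer = device_cost - per_customer_hosting - tieback_cost
    if saving_per_customer <= 0:
        return None
    return int(fixed_dc_cost // saving_per_customer) + 1


# Concentrated customers: short tie-backs, so central hosting wins eventually.
concentrated = breakeven_customers(300.0, 100_000.0, 50.0, 50.0)

# Widely scattered customers: tie-back cost exceeds the per-customer saving.
scattered = breakeven_customers(300.0, 100_000.0, 50.0, 260.0)
```

With these invented numbers the concentrated case crosses over at a few hundred customers while the scattered case never does, which is the pattern the text describes: vCPE keeps its advantage when customers are widely distributed.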

There is a decent chance that if you can concentrate prospects you could migrate from pure vCPE to a hybrid of vCPE and cloud hosting, or completely to the cloud.  You’ll probably know whether this is even feasible based on the census data because that will give you a top-end estimate of what your prospect base could look like.

A final critical point to remember is that all my operator research shows that the critical question with new NFV services isn’t how well your implementation can do in reducing capex or offering “agility” but how well it manages opex.  There is no practical reduction in capex that can justify a large-scale deployment of NFV absent stringent operations efficiency measures.  Even vCPE is vulnerable to opex issues, and vendors’ positioning of NFV consistently underplays opex impact.  Opex control is particularly important at the “S” end of SMB and in the consumer market, where price tolerance is so low and scale so necessarily large that even small operations issues are insurmountable.

Opex will also be critical for operators who have goals too lofty for vCPE and managed services to attain.  Funding a broader NFV and SDN base will be critical for those, and that funding cannot be secured through any realistic path other than opex efficiency.  In fact, in large NFV data centers, the operating costs per tenant could easily run three times the per-tenant capex.

My models say it is possible to make money on managed services, and in fact a decent piece of change, but it’s not just a matter of signing up a few VNF vendors and running an ad.  You’ll need a marketing campaign with proper geographic, demographic, and service targeting, and an implementation that can control operations impacts and save your profits.  All of the technical and benefit issues can be resolved using offerings from a half-dozen vendors.  The business issues are going to take leg work on your own.

Is There a Future in Augmented/Virtual Reality?

Last week there were a number of stories out on virtual reality (VR).  It’s not that the notion is new; gaming developers have tried to deliver on it for a decade or more, and Google’s Glass was an early VR-targeted product.  One of the more interesting stories was a joke.  On April 1st, Google spoofed the space with an offering it called “Cardboard Plastic”, a clear plastic visor thing that hid nothing, and did nothing.  It was a fun spoof, but that doesn’t mean that there’s nothing real about VR.  There are a dozen or more real products out there, with various capabilities.  I’m not going to focus on the design of these, but rather on the applications and impact.

From an application perspective, VR’s most common applications are gaming or presenting people with a visual field that includes their texts, which is a kind of light integration with “real” reality.  These combine to demonstrate VR’s scope—you can create a virtual reality in a true sense, meaning an alternate reality, or you can somehow augment what’s real.

Just as we have two classes of application we have two classes of technology—the “complete” models and the “augmented” models.  A complete VR model creates the entire visual experience for the user.  For that, it could mix generated graphics with a captured real-time (or stored) image.  The augmented models are designed to show something overlaid on a real visual field.  Google’s Glass was an example of augmented reality (Cardboard Plastic would have been a “lightly augmented” one too).  Complete VR can be applied to either the alternate reality or augmented reality applications, but the augmented approach is obviously targeted at supplementing what’s real.

The spoof notion of Cardboard Plastic is a kind of signal for where the notion of augmented reality would go, because it demonstrates that you probably don’t want to spend a lot of money and blow a lot of compute power in recreating exactly what the user would see if there were nothing in the way.  Better to show them reality through the device and then add on some projected graphics.  However, the technology to provide for “real-looking” projections and real see-through is difficult to master, particularly at price points that would be affordable.

The complete model is easier at one level and more difficult at another.  It’s easy to recreate a visual framework from a camera; we do that all the time with phones and live displays on cameras.  The problem is the accuracy of the display: does it “look” real and provide sufficient detail to be useful?  We can approximate both fantasy virtual worlds and augmented reality with the complete VR models today, but the experience isn’t convincing.  In particular, the complete model of VR has to be able to deal with little head movements that the human eye/brain combination wouldn’t convert into a major visual shift, but that VR headsets tend to follow religiously.  Many people get dizzy, in fact.

In theory, in the long term, the difference between the two would shrink as graphics technology improves.  In the near term, the complete models are best seen as a window into a primarily virtual world and the augmented models a window into the real world.  The technical challenges of presenting a credible image of the real world needn’t be solved for augmented-reality devices, which is beneficial if the real world is the major visual focus of the applications.  For this piece, I need to use a different acronym for the two, so I’ll call anything that generates an augmented reality “AR” and the stuff that’s creating a virtual-world reality “VR”.

The applications of AR are the most compelling from an overall impact perspective.  I covered some when Google’s Glass came out; for consumers they include putting visual tags on things they’re passing, heads-up driving displays, or just social horseplay.  For workers, having a schematic of something they’re viewing in real time, displaying the steps that need to be taken in a manual task, and warning them of interfering or incorrect conditions are all good examples of valuable applications.

One thing that should be clear in all these applications is that we’re talking about mobile/wearable technology here, which means that the value of AR/VR outside pure fantasy world entertainment is going to depend on contextual processing of the stimuli that affect the wearer.  You can’t augment reality for a user if you don’t know what reality is.

There are two levels to augmenting reality, two layers of context.  One is what surrounds the user, what the user might be seeing or interacting with.  Think of this as a set of “information fields” that are emitted by things (yes, including IoT “things”).  Included are the geographic context of the user, the social context (who/what might be physically nearby or socially connected), and the “retail” context representing things that might be offered to the user.  The second level is the user’s attention, which means what the user is looking at.  You can’t provide any useful form of AR without reading the location/focus of the user’s eyes.  Fortunately, that technology has existed in high-end cameras for a long time.

AR would demand that you position augmented elements in the visual field at the point where the real element they represented was seen.  However, if you were to move your eyes away from a real element that should probably signal a loss of interest, which should then result in dimming or removing the augmentation elements associated with it.  Otherwise you clutter up the visual field with augmentations and you can’t see the real world any longer.
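The attention rule just described, full augmentation while the user looks at something and a fade once they look away, can be sketched in a few lines of Python.  This is a toy model under stated assumptions: the function names, the 5-degree focus cone, and the 2-second linear fade are hypothetical values picked for illustration, and a real device would get its gaze vector from eye-tracking hardware.

```python
import math


def angle_between(a, b):
    """Angle in degrees between two non-zero 3-D direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return math.degrees(math.acos(cos))


def overlay_opacity(gaze, element_dir, seconds_since_focus,
                    focus_cone_deg=5.0, fade_seconds=2.0):
    """Opacity (0.0 to 1.0) for the augmentation attached to one real element.

    gaze and element_dir are direction vectors from the user's eye;
    seconds_since_focus is how long ago the gaze last rested on the element.
    """
    if angle_between(gaze, element_dir) <= focus_cone_deg:
        return 1.0  # the user is looking at it: show the augmentation fully
    # Looking elsewhere signals loss of interest: fade linearly, then remove.
    remaining = 1.0 - seconds_since_focus / fade_seconds
    return max(0.0, remaining)


straight_ahead = (0.0, 0.0, 1.0)
off_to_the_side = (1.0, 0.0, 0.0)

focused = overlay_opacity(straight_ahead, straight_ahead, 0.0)
fading = overlay_opacity(straight_ahead, off_to_the_side, 1.0)
gone = overlay_opacity(straight_ahead, off_to_the_side, 3.0)
```

Fading rather than switching off instantly is a design choice made here to avoid flicker when the eye darts briefly away; either way, elements the user has stopped looking at stop cluttering the visual field.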

As I said earlier here, and in prior blogs on AR/VR, there is tremendous potential for the space, but you can’t realize it by focusing on the device alone.  You have to be able to frame AR in a real context or it’s just gaming, whatever technology you use.  The second of our two layers of context could be addressed in the device but not the first.

At its best, AR could be a driver for contextual behavior support, which I’ve also talked about before.  Those “fields” that are emitted by various “things” could, if organized and cataloged, serve to tell an application what a user is seeing given the orientation and focus of their AR headset.  If you have this kind of input you can augment reality; if not then you’re really not moving the ball much and you’re limiting the utility and impact of your implementation.

This frames the challenge for useful augmented reality, which includes all those business apps.  The failure of the initial Google Glass model shows, I think, that we can’t have AR without the supporting “thing fields”.  We have to get them either because AR capability pulls them through or because they arise from IoT and contextual services, and I think the latter model is the more realistic because the cost of extensive deployment of information-field resources would be too high for an emerging opportunity like AR to pull through.  Google Glass showed that too.

What this means is that meaningful AR/VR will happen only if we get a realistic model for IoT that can combine with contextual user-agent services to create the framework.  That makes the IoT/context combination even more critical.