What We Can Learn from Verizon’s SDN/NFV Paper

Verizon has just released a white paper on its SDN/NFV strategy, developed with the help of a number of major vendors, and the paper exposes a number of interesting insights into Tier One next-gen network planning.  Some are more detailed discussions of things Verizon has revealed in the past, and some are new and interesting.  This document is over 200 pages long and far too complicated to analyze here, so I’m going to focus on the high-level stuff, and along the way make some comments on the approach.  Obviously Verizon can build their network the way they want; I’m only suggesting places where I think they might change their minds later on.

The key point of the paper, I think, is that Verizon is targeting improvements in operations efficiency and service agility, which means that they’ve moved decisively away from the view that either SDN or NFV is primarily a way of saving money on infrastructure.  This is completely consistent with what operators globally have told me for the last several years; capex simply won’t deliver the benefits needed to transform the business.  And, may I add, business transformation is an explicit Verizon goal, and they target it with both SDN and NFV.

On the SDN side, Verizon is particularly focused on the independence of the control and data planes (“media planes”, as Verizon puts it, reflecting the increased focus on video delivery).  This is interesting because it validates the purist SDN-controller-and-OpenFlow model of SDN over models that leverage software control of current switches and routers.  They also say that they are expecting to use white-box products for the switches/routers in their network, but note here that “white box” means low-feature commodity products and not necessarily products from startups or from non-incumbent network vendors.  It would be up to those vendors to decide if they wanted to get into the switch game with Verizon at the expense of putting their legacy product revenue streams at risk.

On the NFV side, things are a bit less explicit at the high level.  Verizon recognizes the basic mission of NFV as that of decoupling functional software from specific appliances to allow its hosting on server pools.

One of the reasons why the NFV mission is lightweight at the high level, I think, is that Verizon includes an End to End Orchestration layer in its architecture, sitting above both NFV MANO and the SDN controller(s).  This layer is also responsible for coordinating the behavior of legacy network elements that make up parts of the service, and it demonstrates how critical it is to support current technology and even new deployments of legacy technology in SDN/NFV evolution.

Verizon also makes an interesting point regarding SDN and NFV, in the orchestration context.  NFV, they point out, is responsible for deploying virtual functions without knowing what they do—the VNFs appear as equivalent to physical devices in their vision.  Their WAN SDN Controller, in contrast, knows what a function does but doesn’t know whether it’s virtualized or not.  SDN controllers then control both virtual and physical forms of “white box” switches.

One reason for this approach is that Verizon wants an architecture that evolves to SDN/NFV based on benefits, often service-specific missions.  That relates well to the point I made in yesterday’s blog about operators looking more at SDN or NFV as a service solution than as an abstract architecture goal.  All of this magnifies the role of the service model, which in Verizon’s architecture is explicitly introduced in three places.  First, as the way an OSS/BSS sees the service, which is presumably a retail-level view.  Second, as the way of describing resource-behavior-driven cooperative service elements.  Third (in the form of what Verizon calls “device models”), as an abstraction for the functional units that can be mapped either to VNFs or physical network functions (PNFs).  End-to-End Orchestration (EEO) then manages the connection of models and doesn’t have to worry about the details of how each model is realized.  This is a firm vote in favor of multi-layer, divided, orchestration.
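
To make the layering concrete, here’s a minimal sketch of the idea in Python—my own illustration, not anything drawn from the Verizon paper, and all class, service, and image names are hypothetical.  The point is simply that the orchestration layer composes abstract device models while the realization (VNF or PNF) stays hidden below the abstraction.

    # Minimal sketch (not Verizon's code): an OSS-facing service model composed of
    # device models, each realized by either a virtual (VNF) or physical (PNF)
    # network function.  All names are hypothetical.
    from abc import ABC, abstractmethod

    class Realization(ABC):
        @abstractmethod
        def deploy(self) -> str: ...

    class VNFRealization(Realization):
        def __init__(self, image): self.image = image
        def deploy(self): return f"instantiated VNF from image {self.image}"

    class PNFRealization(Realization):
        def __init__(self, device_id): self.device_id = device_id
        def deploy(self): return f"configured physical device {self.device_id}"

    class DeviceModel:
        """Functional abstraction; orchestration never sees VNF vs. PNF."""
        def __init__(self, name, realization: Realization):
            self.name, self.realization = name, realization
        def activate(self): return f"{self.name}: {self.realization.deploy()}"

    class ServiceModel:
        """OSS/BSS-facing view: an ordered collection of device models."""
        def __init__(self, name, devices): self.name, self.devices = name, devices
        def activate(self): return [d.activate() for d in self.devices]

    if __name__ == "__main__":
        svc = ServiceModel("business-vpn", [
            DeviceModel("edge-firewall", VNFRealization("fw-image-1.2")),
            DeviceModel("core-router", PNFRealization("rtr-0042")),
        ])
        for line in svc.activate():
            print(line)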

Management in the VNF sense is accommodated in the Service Assurance function, which Verizon says “collects alarm and monitoring data. Applications within SA or interfacing with SA can then use this data for fault correlation, root cause analysis, service impact analysis, SLA management, security monitoring and analytics, etc.”  This appears to be a vote for a repository model for management.  However, they don’t seem to include the EMS data for legacy elements in the repository, which I think reflects their view that how a function is realized (and thus how it is managed) is abstracted in their model.

I do have concerns about these last two points.  I think that a unified modeling approach to services is both possible and advantageous, and I think that all management information should be collected in the same way to facilitate a unified model and consistent service automation.  It may be that Verizon is recognizing that no such unified model has emerged in the space, and thus are simply accommodating the inevitability of multiple implementations.

An interesting feature of the architecture on the SDN side is the fact that Verizon has three separate SDN controller domains—access, WAN, and data center.  This, I think, is also an accommodation to the state of SDN (and NFV) progress, because a truly powerful SDN domain concept (and a related one for NFV) would support any arbitrary hierarchy of control and orchestration.  Verizon seems to be laying out its basic needs to help limit the scope of integration needed.  EEO is then responsible for harmonizing the behavior of all the domains involved in a service—including SDN, NFV, and legacy devices.

Another area where I have some concerns is in the infrastructure and virtualization piece of the architecture.  I couldn’t find an explicit statement that the architecture would support multiple infrastructure managers, beyond the requirement that both virtual and physical infrastructure managers be supported.  But does this multiplicity also extend within each category?  If not, it may be difficult to accommodate multi-vendor solutions, given that we already have proprietary management on the physical-device side, and that the ETSI specs aren’t detailed enough to ensure that a single VIM could manage anyone’s infrastructure.

My management questions continue in the VNF Manager space.  Verizon’s statement is that “Most of the VNF Manager functions are assumed to be generic common functions applicable to any type of VNF. However, the NFV-MANO architectural framework needs to also support cases where VNF instances need specific functionality for their lifecycle management, and such functionality may be specified in the VNF Package.”  This allows an arbitrary split model of VNF management, particularly given that there are no specifications for how “generic functions” are defined or how VNF providers can support them.  It would seem that vendors could easily spin most management into something VNF-specific, which could then complicate integration and interchangeability goals.

EEO is the critical element of the architecture overall.  According to the document, “The objective of EEO is to realize zero-touch provisioning: a service instantiation request — from the operations crew or from the customer, through a self-service portal – results in an automatically executed work flow that triggers VNF instantiation, connectivity establishment and service activation.”  This appears to define a model where functions are assembled to create a service, and then lifecycle management for each function is expected to keep the service in an operating state.  However, Service Assurance interfaces with EEO to respond to SLA failures, which seems to create the potential for multi-level responses to problems that would then have to be organized through fault correlation or response analysis.  All of that could be handled through policy definition and distribution, which Verizon’s architecture also requires.
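
Here’s a toy illustration of what that zero-touch flow implies, written in Python purely for clarity; the function names and the order structure are my own assumptions, not anything from the Verizon document.  The point is that one request cascades through the MANO and SDN domains to an active service without manual intervention.

    # Illustrative-only sketch of the "zero-touch" flow: one service request drives
    # function instantiation, connectivity, and activation with no manual steps.
    # All function and field names here are hypothetical.
    def instantiate_vnfs(order):
        return [f"vnf:{f}" for f in order["functions"]]    # NFV MANO domain

    def establish_connectivity(vnfs):
        return f"chain({'->'.join(vnfs)})"                 # SDN controller domain(s)

    def activate_service(order, chain):
        return {"service": order["service"], "chain": chain, "state": "active"}

    def zero_touch_provision(order):
        """End-to-end orchestration: order in, running service out."""
        vnfs = instantiate_vnfs(order)
        chain = establish_connectivity(vnfs)
        return activate_service(order, chain)

    if __name__ == "__main__":
        print(zero_touch_provision({"service": "vCPE-gold",
                                    "functions": ["firewall", "nat", "ids"]}))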

The final interesting element in the Verizon paper is its statement on Intent-Based Networking (IBN), which is their way of talking about intent modeling and the ONF initiatives in that direction.  The paper makes it clear that Verizon sees IBN as a future approach to “popularized” network control rather than as a specific principle of the current architecture, but on the other hand their models (already referenced above) seem to apply intent-based principles throughout their architecture.  It may be that Verizon is using the term “IBN” to refer only to the evolving ONF initiatives, and that it expects to use intent principles in all its model layers.

The most important thing that comes out of the Verizon document, in my view, is that neither the current ONF nor NFV ISG work is sufficient to define an architecture for the deployment of even SDN and NFV (respectively) much less for the deployment of a service.  Integration testing, product assessment, and even SDN/NFV transformation planning need to look a lot further afield to be useful, and that’s going to involve in many cases making up rules rather than identifying standards.  This means, IMHO, that Verizon is not only willing but determined to move beyond the current processes and make their own way.

For players like Ericsson, this could be good news because if every operator follows the Verizon lead and defines their own next-gen architecture, there will be considerable integration work created.  That might diminish in the long term if standards bodies and open-source initiatives start to harmonize the implementation of SDN and NFV and incorporate the required higher-level concepts.

The six vendors I’ve identified as being capable of supporting a complete NFV business case could also learn something from Verizon’s paper.  One clear lesson is that a failure to develop a total benefit picture in positioning, which I think all six vendors have been guilty of, has already exacted a price.  I don’t think operators, including Verizon, would have gone so far in self-integration if they’d had a satisfactory strategy offered in productized form.  However, all six of the key NFV vendors can make the Verizon model work.  Who makes it work best?  I think Nokia wins that one.  I know one of the key Nokia contributors to the Verizon paper, and he’s both an IMS and federation expert.  And, no matter what Verizon may feel about vCPE, it is clear to me that their broad deployment of NFV will start with mobile/5G.

Overall, Verizon’s paper proves a point I discussed yesterday—vendors who want to succeed with NFV will need to have a strong service-based story that resonates with each operator’s market, and a broad architecture that covers all the bases from operations to legacy infrastructure.  Verizon has clearly taken some steps to open up the field to include many different vendors, but most operators will have to rely on a single vendor to at least get the process started, and everyone is going to want to be that vendor.  The bar has been high for that position from the first, and it’s getting higher every day.

What Buyers Think about NFV and the Cloud

I got back from a holiday to a flood of data from both enterprises and network operators/service providers—the former group focusing on cloud and network service views, and the latter group focusing on NFV.  Because all the data is so interesting I thought it was important to get it into a blog ASAP, so here we go!

Let’s start with my last group, the operators/providers.  The issue I was working on was the expectations for NFV in 2016, now that we’re a third of the way through the year.  I got responses from key people in the CIO, CFO, and CTO areas, and I had some surprises—perhaps even a few big ones.

The biggest surprise was that all three groups said they were more sure now that they would deploy some NFV infrastructure in 2016 than they had been at the start of the year.  Nearly five out of six operators responded that way, which is a pretty optimistic view of NFV progress.  What was particularly interesting was that the three key groups all responded about the same way, a sharp departure from last year when CFOs weren’t convinced there was any future in NFV at all.

The second-biggest surprise, perhaps related to the first, was that the amount of NFV infrastructure expected to be deployed was almost universally minimal.  None of the operators said they believed that NFV spending would reach 3% of capital spending even in 2017.  This suggests that operators weren’t rushing to NFV, but perhaps waving earnestly in its direction.

The reason for the limited commitment expected even in 2017 is complicated, and you could approach it in two ways—what makes up the commitment and what has impacted the planning.  Let’s take the first of these first.

There are two NFV paths that are considered viable by all the key operator constituencies—NFV targeting business customers with premises-hosted virtual CPE, and NFV targeting mobile infrastructure.  The first of these opportunities is not seen as a driver of large-scale systemic NFV at all, but rather as a way of addressing the needs of businesses better, particularly through accelerated order cycles, portals for service orders and changes, etc.  The second is seen as potentially ground-shaking in terms of impact, but NFV in mobile infrastructure is very complicated and risky, in operators’ eyes, and thus they expect to dabble a bit before making a major investment.

Add these together and you can see what operators are seeing in 2016 and 2017.  vCPE is going to happen, but nobody thinks that it’s going to shake their spending plans except perhaps MSPs who lease actual service from another operator and supplement it with CPE and management features.  Mobile applications could be very big indeed, but that very bigness means it’s going to happen at a carefully considered pace.

If all the “Cs” in the operator pantheon are in accord on these basic points, they still differ on the next one, which is the factors that got them to where they are.  The CIO and CFO organizations feel that a business case has been made for vCPE—enough said.  They also feel that there’s enough meat in mobile NFV to justify real exploration of the potential, including field trials.  But they do not believe that a broad NFV case has been made, and in fact these two groups believe that no such broad case for NFV can be made based on current technology.  The CTO, perhaps not surprisingly, thinks that NFV’s broad impact is clear and that it’s just a matter of realizing its potential.

Everyone is back in sync when it comes to what might realize NFV potential—open-source software.  In fact, the CIO and CFO are starting to think that their transformation is more generally about open-source use than about NFV in particular.  This, perhaps, is due to the fact that the CTO organizations have shifted their focus to open-source projects aiming at NFV software, from the standards process that’s still the titular basis for NFV.  Getting all the pieces of NFV turns out to involve getting a lot of related open-source stuff that can be applied to NFV, but also to other things.

Everyone has their own example of this.  The CIOs are starting to see the possibility of basing OSS/BSS transformation more on open-source tools—at least those CIOs who are interested in OSS/BSS transformation at all.  It’s clear from the responses I got that CIOs are split on just how much the fundamentals of their software need to change.  A slight majority still see their job as simply accommodating SDN/NFV changes, but it seems more and more CIOs think a major change in operations software is needed.

There’s also more harmony of views than I’d expected on just how far NFV will go in the long run and what will drive it there.  Only about a quarter of CIO/CFO/CTO responses suggest that systemic infrastructure change will drive broad adoption of NFV.  The remainder say that it will be service-specific, and the service that has the most chance of driving broad deployment is mobile.  Those operators with large mobile service bases think they’ll be adopting NFV for many mobile missions, and this will then position more hosting infrastructure for other services to exploit.  Those without mobile infrastructure see NFV mostly as a business service agility tool, as I’ve said.

My impression from this is that operators have accepted a more limited mission for NFV, one that’s driven by specific service opportunities, over the notion of broad infrastructure transformation.  That doesn’t mean that we won’t get to the optimum deployment of NFV, but it does suggest that getting there will be a matter of a longer cycle of evolution rather than an aggressive deployment decision.  The fact that the CIOs are not united on a course for OSS/BSS transformation seems the largest factor in this; you need those opex benefits to justify aggressive NFV roll-out.

Services that prove a broad benefit case—that’s the bottom line for NFV.

On the cloud side, enterprises’ views are related to media coverage of cloud adoption, but only just related.  Nearly all enterprises and about a third of mid-sized businesses report they use cloud services directly (not just hosting, but real cloud deployment of their apps).  Only one enterprise, a smaller one, said they had committed more than 15% of IT spending to the cloud, and none of the enterprises expected to shift more than 30%.  The cloud will be pervasive, but not decisive.

The problem seems to be that enterprises still visualize the sole benefit of cloud computing to be lower costs, in some sense at least, and that the cloud will host what they already run.  Given this picture, it’s not surprising that both Microsoft and IBM reported that while cloud sales have grown for them, it hasn’t replaced losses on the traditional IT side.  Users who adopt the cloud because it’s cheaper will always spend less.  The only way to get out of that box is to unlock new benefits with new cloud-specific techniques, but users have almost zero understanding of that potential.  Part of that is due to lack of vendor support for new productivity-benefit-driven cloud missions, and part to the fact that current enterprise IT management has grown up in a cost-driven planning age and can’t make the transition to productivity benefits easily.

There’s one common thread between the operator NFV and enterprise cloud stories, and that’s over-expectation.  Well over 90% of both groups say that they’ve had to work hard to counter expectations and perspectives that just didn’t jibe with the real technology the market was presenting or the realistic benefits that technology could generate.  We live in an ad-sponsored age, where virtually everything you read is there because some seller has paid for it in some way.  That’s obviously going to promote over-positioning, and while the results aren’t fatal for either technology, I think it’s clear that NFV and the cloud would have progressed (and would continue to progress) further and faster if buyers could get a realistic view of what can be expected.

The Best Approach to SDN and NFV isn’t from ETSI or Open-Something, but From the MEF

I had a very interesting talk with the MEF and with their new CTO (Pascal Menezes), covering their “Third Network”, “Lifecycle Service Orchestration” and other things.  If you’ve read my stuff before, you know that there are many aspects of their strategy that I think are insightful, even compelling.  I’m even more sure about that after my call with them, and also more certain that they intend to exploit their approach fully.  I’m hoping that they can.

The Third Network notion comes about because we have two networks today—the Internet, which is an everybody-to-everybody, open, best-efforts fabric that mingles everything good and bad about technology or even society, and business connectivity, which is more trustworthy, has an SLA, and supports an explicit and presumably trusted community.  One network will let me reach anything, do anything, in grand disorder.  The other orders things, and with that order comes a massive inertia that limits what I can do.

We live in a world where consumerism dominates more and more of technology, and where consumer marketing dominates even businesses.  An opportunity arises in moments, and is lost perhaps in days.  In the MEF vision (though not explicitly in their positioning) the over-the-top players (OTTs) have succeeded and threatened operators because the OTTs could use the everything-everywhere structure of the Internet to deliver service responses to new opportunities before operators could even schedule a meeting of all their stakeholders.

The operators can hardly abandon their networks, for all the obvious reasons.  They need to somehow adapt their network processes to something closer to market speed.  I think that the MEF concept of the Third Network reflects that goal nicely in a positioning sense.

At a technical level, it could look even better.  Suppose we take MEF slides at the high level as the model—the Third Network is an interconnection of provider facilities at Level 2 that creates a global fabric that rivals the Internet in reach without its connectivity promiscuity and its QoS disorder.  If you were to build services on the Third Network you could in theory have any arbitrary balance of Internet-like and Carrier-Ethernet-like properties and costs.  You could establish Network-as-a-Service (NaaS) in a meaningful sense.

In my view the obvious, logical, simple, and compelling architecture is to use the Third Network as the foundation for a set of overlay networks.  Call them Nicira-SDN-like, or call them tunnels, or virtual wires, or even SD-WANs.  Tunnels would create a service framework independent of the underlayment, which is important because we know that L2 connectivity isn’t agile or scalable on the level of the Internet.  The point is that these networks would use hosted nodal functionality combined with overlay tunnels to create any number of arbitrary connection networks on top of the Ethernet underlayment.  This model isn’t explicit in the MEF slides but their CTO says it’s their long-term goal.

A combination of overlay/underlay and an interconnected-metro model of the network of the future would be in my view incredibly insightful, and if it could be promoted effectively, it could be a revolution.  The MEF is the only body that seems to be articulating this model, and that makes them a player in next-gen infrastructure in itself.

What’s needed to make this happen?  The answer is two things, and two things only.  One is a public interconnection of Level 2 networks to create the underlayment.  The other is a place to host the nodal features needed to link the tunnels into virtual services.  We can host features at the user edge if needed, and we know how to do network-to-network interfaces (NNIs).  The operators could field both these things if they liked, but so could (and do, by the way) third parties like MSPs.

What would make this notion more valuable?  The answer is “the ability to provide distributed hosting for nodal functionality and other features”.  Thus, philosophically above our Third Network connection fabric would be a tightly coupled cloud fabric in which we could deploy whatever is needed to link tunnels into services and whatever features might be useful to supplement the basic connectivity models we can provide that way.  These, classically, are “LINE”, “LAN”, and “TREE”, which the MEF recognizes explicitly, as well as ACCESS and NNI.

If the Third Network is going to provide QoS, then it needs to support classes of service in the L2 underlayment, and be able to route tunnels for services onto the proper CoS.  If it’s going to provide security then it has to be sure that tunnels don’t interfere or cross-connect with each other, and that a node that establishes/connects tunnels doesn’t get hacked or doesn’t create interfering requests for resources.  All of that is well within the state of the art.  It also has to be able to support the deployment of nodes that can concentrate tunnel traffic internally to the network for efficiency, and also to host features beyond tunnel cross-connect if they’re useful.
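
To show how simple the overlay/underlay division can be, here’s a small Python sketch of overlay tunnels that request a class of service from a hypothetical L2 underlayment.  The CoS names, markings, and service constructors are illustrative only and not drawn from any MEF specification.

    # A toy sketch of the overlay/underlay idea: each overlay tunnel asks for a
    # class of service, and a hypothetical L2 underlayment exposes a few CoS
    # markings that tunnels are mapped onto.  Names and values are illustrative.
    COS_MAP = {"best-effort": 0, "business": 3, "real-time": 5}

    class Tunnel:
        def __init__(self, endpoint_a, endpoint_b, cos="best-effort"):
            if cos not in COS_MAP:
                raise ValueError(f"unknown class of service: {cos}")
            self.endpoints = (endpoint_a, endpoint_b)
            self.cos_marking = COS_MAP[cos]    # underlay CoS the tunnel rides on

        def __repr__(self):
            a, b = self.endpoints
            return f"Tunnel({a}<->{b}, cos_marking={self.cos_marking})"

    # An overlay "LINE" service is a single tunnel; a "LAN" is a full mesh of them.
    def line(a, b, cos):
        return [Tunnel(a, b, cos)]

    def lan(sites, cos):
        return [Tunnel(a, b, cos) for i, a in enumerate(sites) for b in sites[i + 1:]]

    if __name__ == "__main__":
        print(line("nyc-edge", "lon-edge", "business"))
        print(lan(["site1", "site2", "site3"], "real-time"))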

You don’t need either SDN or NFV for this.  You can build this kind of structure today with today’s technology, probably at little incremental cost.  That to my view is the beauty of the Third Network.  If, over time, the needs of all those tunnels whizzing around and all that functionality hunkering down on hosting points can be met better with SDN or NFV, or cheaper with them, or both—then you justify an evolution.

What you do need in the near term is a means of orchestrating and managing the new overlay services.  Lifecycle Service Orchestration (LSO) is the MEF lifecycle process manager, but here I think they may have sunk themselves too far into the details.  Yes it is true that tunnels will have to be supported over legacy infrastructure (L2 in various forms, IP/MPLS in various forms), SDN, and NFV.  However, that should be only the bottom layer.  You need a mechanism for service-level orchestration because you’ve just created a service overlay independent of the real network.

The details of LSO are hard to pull from a slide deck, but it appears that it’s expected to act as a kind of overmind to the lower-level management and orchestration processes of NMS, SDN, and NFV.  If we presumed that there was a formal specification for the tunnel-management nodes that could be resident in the network (hosted in the cloud fabric) or distributed to endpoints (vCPE) then we could say this is a reasonable presentation of features.  The slides don’t show that, and in fact don’t show the specific elements for an overlay network—those tunnel-management nodes.

It all comes down to this, in my view.  If the MEF’s Third Network vision is that of an overlay network on top of a new global L2 infrastructure, then they need tunnel-management nodes and they need to orchestrate them at least as much as the stuff below (again, they assure me that this is coming).  You could simply let CoS do what’s needed, if you wanted minimalist work.  If they don’t define those tunnel-management nodes and don’t orchestrate them with LSO, then I think the whole Third Network thing starts to look like slideware.

The Third Network’s real state has special relevance in the seemingly endless battle over the business case for network evolution.  In my own view, the Third Network is a way of getting operators close to the model of future services that they need, without major fork-lift modernization or undue risk.  It could even be somewhat modular in terms of application to services and geographies.  Finally, it would potentially not only accommodate SDN and NFV but facilitate them—should it succeed.  If the Third Network fails, or succeeds only as a limited interconnect model, then operators will inevitably have to do something in the near term, and what they do might not lead as easily to SDN and NFV adoption.

This could be big, but as I’ve noted already the model isn’t really supported in detail by the MEF slideware, and in fact I had to have an email exchange with the CTO to get clarifications (particularly on the overlay model and overlay orchestration) to satisfy my requirement for written validation of claims.  He was happy to do that, and I think the MEF’s direction here is clear, but the current details are sparse because the long term is still a work in progress.

The MEF is working to re-invent itself, to find a mission for L2 and metro in an age that seems obsessed with virtualizing and orchestrating.  Forums compete just like vendors do, after all, and the results of some of this competition are fairly cynical.  I think that the MEF has responded to the media power of SDN and NFV, for example, by featuring those technologies in its Third Network, when the power of that approach is that it doesn’t rely on either, but could exploit both.  Their risk now lies in posturing too much and addressing too little, of slowing development of their critical and insightful overlay/underlay value proposition to blow kisses at technologies that are getting better ink.  There’s no time for that.

Whether the foundation of the Third Network was forum-competition opportunism or market-opportunity-realization is something we may never know, but frankly it would matter only if the outcome was questionable.  I’m more convinced than ever that the MEF is really on to something with the Third Network.  I hope they take it along the path they’ve indicated.

How to Get NFV On Track

You can certainly tell from the media coverage that progress on NFV isn’t living up to press expectations.  That’s not surprising on two fronts: first, press expectations border on an instant-gratification fetish that nothing could live up to, and second, transformation of a three-trillion-dollar industry with average capital cycles of almost six years won’t happen overnight.  The interesting thing is that many operators were just as surprised as the press has been at the slow progress.  Knowing more about their perceptions might be a great step to getting NFV going, so let’s look at the views and the issues behind them.

In my recent exchanges with network operator CFO organizations, I found that almost 90% said that NFV was progressing more slowly than they had hoped.  That means that senior management in the operator space had really been committed to the idea that NFV could solve their declining profit-per-bit problems before the critical 2017 point when the figure falls low enough to compromise further investment.  They’re now concerned it won’t meet their goals.

Second point:  The same CFO organizations said that their perception was that NFV progress was slower now than in the first year NFV was launched (2013).  That means that senior management doesn’t think that NFV is moving as fast as it was, which means that as an activity it’s not closing the gap to achieving its goals.

Third point:  Even in the organizations that have been responsible for NFV development and testing, nearly three out of four say that progress has slowed and that they are less confident that “significant progress” is being made on building a broad benefit case.

Final point: Operators are now betting more on open-source software and operator-driven projects than on standards and products from vendors.  Those CFO organizations said that they did not believe they would deploy NFV based on a vendor’s approach, but would instead be deploying a completely open solution.  How many?  One hundred percent.  The number was almost the same for the technologists who had driven the process.  Operators have a new horse to ride.

I’m obviously reporting a negative here, which many vendors (and some of my clients) won’t like.  Some people who read my blog have contacted me to ask why I’m “against” NFV, which I find ironic because I’ve been working to make it succeed for longer than the ETSI ISG has even existed.  Further, I’ve always said (and I’ll say again here and now) that I firmly believe that a broad business case can be made for NFV deployment.  I’ve even named six vendors who can make it with their own product sets.  But you can’t fix a problem you refuse to acknowledge.  I want to fix it and so I want to acknowledge it.

The first problem was that the ETSI ISG process was an accommodation to regulatory barriers to operators working with each other to develop stuff.  I’ve run into this before; in one case operator legal departments literally said they’d close down an activity because it would be viewed as regulatory collusion as it was being run.  The collusion issue was fixed by absorption into another body (dominated by vendors) but the body never recovered its relevance.  That also happened with NFV, run inside ETSI and eventually dominated by vendors.

The second problem was that standards in a traditional sense are a poor way to define what has to be a software structure.  Software design principles are well-established; every business lives or dies on successful software after all.  These principles have to be applied by a software design process, populated by software architects.  That didn’t happen, and so we have what’s increasingly a detailed software design created indirectly and without any regard for what makes software open, agile, efficient, or even workable.

The third problem was that you can’t boil the ocean, and so early NFV work focused on two small issues—did the specific notion of deploying VNFs to create services work at the technical level, and could that be proved for a “case study”.  Technical viability should never have been questioned at all because we already had proof from commercial public cloud computing that it did work.  Case studies are helpful, but only if they represent a microcosm of the broad targets and goal sets involved in the business case.  There was never an attempt to define that broad business case, and so the case studies turned into proofs of concept that were totally service-specific.  No single service can drive infrastructure change on a broad scale.

All of this is what’s generated the seemingly ever-expanding number of “open” or “open-source” initiatives.  We have OPNFV, ONOS, OSM, OPEN-O, and operator initiatives like ECOMP from AT&T.  In addition, nearly all the vendors who have NFV solutions say their stuff is at least open, and some say it’s open-source.  The common thread here is that operators are demanding effective implementations, have lost faith that vendors will generate them on their own, and so are working through open-source to do what their legal departments wouldn’t let them do in a standards initiative.

The open-source approach is what should have been done from the first, because in theory it can be driven by software architecture and built to address the requirements first, in a top-down way.  However, software design doesn’t always proceed as it should, and so even this latest initiative could fail to deliver what’s needed.  What’s necessary to make that happen?  That’s our current question.

The goal now, for the operators and for vendors who want NFV to succeed, is to create an open model for NFV implementation and back that model with open-source implementations.  That model has to have these two specific elements:

  1. There must be an intent-model interface that identifies the relationship between the NFV MANO process and OSS/BSS/NMS, and another that defines the “Infrastructure Manager” relationship to MANO.
  2. There must be a Platform-as-a-Service (PaaS) API set that defines the “sandbox” in which all Virtual Network Functions (VNFs) run, and that provides the linkage between VNFs and the rest of the NFV software (a rough sketch of both boundaries follows this list).
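
To give a sense of what those two boundaries might look like, here’s a rough Python sketch; every name in it is hypothetical, and it’s intended only to show the shape of the contracts—intent in, handle and status out, plus a fixed platform API for VNFs—not to represent any ETSI or vendor interface.

    # A rough sketch of the two boundaries, with hypothetical names; the point is
    # the shape of the contracts, not the specific calls.
    from abc import ABC, abstractmethod

    class IntentInterface(ABC):
        """What OSS/BSS/NMS asks of MANO, and what MANO asks of an Infrastructure
        Manager: state what is wanted, not how to do it."""
        @abstractmethod
        def request(self, intent: dict) -> str: ...   # returns an opaque handle
        @abstractmethod
        def status(self, handle: str) -> dict: ...    # SLA/state, not internals

    class VNFPlatformAPI(ABC):
        """The PaaS 'sandbox' every VNF would be written against."""
        @abstractmethod
        def get_config(self, key: str) -> str: ...
        @abstractmethod
        def emit_event(self, event: dict) -> None: ...        # lifecycle/management events
        @abstractmethod
        def register_health_probe(self, probe) -> None: ...   # hook for generic VNF management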

There are three elements to NFV.  One is the infrastructure on which stuff is deployed and connected, and this is represented by an infrastructure manager (IM, in my terms; VIM for “virtual” infrastructure manager in the ETSI ISG specs).  One is the management and orchestration component itself, MANO, and one is the VNFs.  The goal is to standardize the functionality of these three things and to control the way they connect among themselves and to the outside.  This is critical in reducing integration issues and providing for open, multi-vendor implementations.

We can’t simply collect the ETSI material into a set of specs to define my three elements and their connections; the details don’t exist in ETSI material.  This puts anything that’s firmly bound to the ETSI model at risk of being incomplete.  While an open-source implementation could expose and fix the problems, it’s not totally clear that any do (ONOS, CORD, and XOS among the open groups, or ECOMP for operators, seem most likely to be able to do what’s needed).

Vendors have to get behind this process too.  They can do so by accepting the componentization I’ve noted, and by supporting the intent models and PaaS, by simply aligning their own implementations that way.  Yes, it might end up being a pre-standards approach, but the kind of API-and-model structure I’ve noted can be transformed to a different API format without enormous difficulty—it’s done in software so often that there’s a process called an “Adapter Design Pattern” (and some slightly different but related ones too) to describe how it works.  The vendors, then, could adapt to conform to the standards that emerged from the open-source effort.  They could also still innovate in their own model if they wanted, providing they could prove the benefit and providing they still offered a standard approach.
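
Here’s a minimal Python sketch of that Adapter Design Pattern idea—a hypothetical vendor API wrapped so it presents the standard intent-style interface.  The vendor call and the intent format are invented for illustration; the pattern, not the names, is the point.

    # A minimal Adapter Design Pattern sketch: a hypothetical vendor-specific
    # orchestrator API is wrapped so it presents the standard intent-style
    # interface described above.
    class VendorOrchestrator:
        """Stand-in for a proprietary API."""
        def create_service_instance(self, blueprint_id, params):
            return {"instance": f"{blueprint_id}-001", "params": params}

    class StandardIntentAdapter:
        """Exposes the agreed interface; translates calls to the vendor's API."""
        def __init__(self, vendor_api):
            self.vendor_api = vendor_api

        def request(self, intent):
            result = self.vendor_api.create_service_instance(
                blueprint_id=intent["service"], params=intent.get("parameters", {}))
            return result["instance"]

    if __name__ == "__main__":
        adapter = StandardIntentAdapter(VendorOrchestrator())
        print(adapter.request({"service": "vpn-gold", "parameters": {"sites": 4}}))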

This open approach isn’t essential in making the business case for NFV.  In some respects, it’s an impediment because it will take time for any consensus process to work out an overall architecture that fits (in general) my proposed model.  A single-vendor strategy could do that right now—six of them, in fact.  The problem is that vendors have lost the initiative now, and even if they got smart in their positioning it’s not clear that they could present a proprietary strategy that had compelling benefits.  They need an open model, a provable one.  That’s something that even those six might struggle a bit with; I don’t have enough detail on about half of the six to say for sure that they could open theirs up in a satisfactory way.  All of them will need some VNF PaaS tuning.

I think that it is totally within the capabilities of the various open-source organizations to solve the architecture-model problem and generate relevant specs and APIs, as well as reference implementations.  It is similarly well within vendor capabilities to adopt a general architecture to promote openness—like the one I’ve described here—and to promise to conform to specific standards and APIs as they are defined.  None of this would take very long, and if it were done by the end of the summer (totally feasible IMHO) then we’d remove nearly all the technical barriers to NFV deployment.  Since I think applying the new structure to the business side would also be easy, we’d quickly be able to prove a business case.

Which is why I think this impasse is so totally stupid.  How does this benefit anyone, other than perhaps a few vendors who believe that even if operators end up losing money on every service bit they carry they’ll sustain their spending or even grow it?  A small team of dedicated people could do everything needed here, and we have thousands in the industry supposedly working on it.  That makes no sense if people really want the problem solved.

My purpose here is to tell the truth as I see it, which is that we are threatening a very powerful and useful technology with extinction with no reason other than stubborn refusal to face reality.  NFV can work, and work well, if we’re determined to make that happen.

Is Ericsson’s NodePrime Deal Even Smarter Than it Looks?

Ericsson has made some pretty smart moves in the past, long before their smartness was obvious to the market.  They may have made another one with their acquisition of NodePrime, a hyperscale data center management company that could be a stepping stone for Ericsson to supremacy in a number of cloud-related markets, including of course IoT.

The theory behind the deal seems clear: if IoT or NFV or any other cloud-driven technology is going to succeed on a large scale, then data centers are going to explode.  Thus, dealing with that explosion would be critical in itself, but to make matters worse (or better, from Ericsson’s view) just getting to large-scale success will certainly require enormous efficiency in operationalizing the data center resources as they grow.  Hence, NodePrime.

Data centers don’t lack sources of operations statistics; there are a couple dozen primary statistics points in any given operating system and at least a half-dozen in middleware packages.  In total, one operator found they had 35 sources of data to be analyzed per server and 29 per virtual machine, and then of course there’s the network.  The basic NodePrime model is to acquire, timestamp, and organize all of this into a database that can then be used for problem and systems management and lifecycle management for installed applications.

Hyperscale data centers aren’t necessarily the result of SDN, NFV, or IoT.  While NodePrime positioning calls out that target, they also make it clear that they can manage data centers of any size, which means that they could probably manage both the individual distributed data centers operators are likely to deploy in SDN/NFV/IoT applications and the collective virtual data center (that’s a layer in NodePrime, in fact).  The NodePrime model also has three functional dimensions.  You can manage data centers this way.  You can manage service infrastructure this way, and feed the results into something like NFV management, and you could even build an IoT service by collecting sensor data like you collect server statistics.  I’m told that some operators have already looked at all three of these functional dimensions, and that NodePrime has said they support them all.

If we presumed that the management of an application or service was based on the analysis of the resource management data that was committed in support, then any complicated service could have insurmountable management problems.  If we presumed that a smartphone had to query a bunch of traffic sensors directly and analyze the trends and movements to figure a route, the problems are similarly insurmountable.  The fact is that any rational application based on using information has to be designed around an information analysis framework.

A framework has to do three things.  First, it has to gather information from many different sources using many different interfaces.  Second, it has to harmonize the data into a common model and timestamp and contextualize the information, and finally it has to support all the standard analytics and query tools to provide data views.  NodePrime does all of this.
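
As a simple illustration of those three jobs, here’s a toy Python sketch of gathering, harmonizing, and querying operations data.  It’s my own model of the concept, not NodePrime’s schema or API, and the source names are made up.

    # Illustrative sketch of the three framework jobs: collect from heterogeneous
    # sources, normalize into a common timestamped record, and support queries.
    import time

    def normalize(source, metric, value):
        """Harmonize any source's reading into one common record format."""
        return {"ts": time.time(), "source": source, "metric": metric, "value": float(value)}

    class MetricStore:
        def __init__(self):
            self.records = []

        def ingest(self, record):
            self.records.append(record)

        def query(self, metric, since=0.0):
            return [r for r in self.records if r["metric"] == metric and r["ts"] >= since]

    if __name__ == "__main__":
        store = MetricStore()
        # Two different "interfaces" feeding the same store.
        store.ingest(normalize("server-17/os", "cpu_util", "42.5"))
        store.ingest(normalize("vm-203/hypervisor", "cpu_util", 61))
        print(store.query("cpu_util"))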

The NodePrime model could represent a management abstraction very easily (datahub, in NodePrime).  The resources at the bottom are collected and correlated, passed into an analytics layer in the middle (directive) and used to create an abstraction of the resource framework that’s useful (virtual datacenter in the current model, but why not “virtual infrastructure”?).  This abstraction could then be mapped to service management, VNF management, and so forth.

It also works for IoT and contextual services.  Collect basic data at the bottom, use queries to generate abstractions that are helpful to users/applications, then expose these abstractions through microservices at the top.  NodePrime supports this too.

Well, it sort of does.  The meat of the value of NodePrime will come from the variety of information resources it can link and the range of analytics it can support.  For SDN, NFV, IoT, and other cloud applications and services, a lot of this is going to be developed by an integrator—like Ericsson.  Ericsson can enrich the capabilities of NodePrime through custom development and specialized professional services, which of course is what it likely wanted all along.

This isn’t the first time that a vendor has come at the notion of a network revolution driven by data centers and not networks.  Brocade had this message as the foundation of their NFV position in 2013 and gained a lot of traction with operators as a result.  They didn’t carry through with enough substance and they gradually lost influence.  Brocade has recently been making some acquisitions of its own, and one in the DevOps space that could arguably be an orthogonal shot at the data center space, because it’s targeting deployment and lifecycle management.

An inside-out vision of network evolution is logical, then, but it’s also a climb.  The further you are from the primary benefit case with a given technology, the longer it takes for you to build sales messaging that carries you through.  That’s been the problem with SDN and NFV, both of which in simplistic terms postulate a completely new infrastructure that would be cheaper to run and more agile.  How do you prove that without making a massive unsupported bet?

That’s where an Ericsson initiative to connect NodePrime decisively with IoT could be extravagantly valuable.  Industrial IoT isn’t really IoT at all, it’s simply an evolution of the established process control and M2M trends of the recent past.  However, the model that’s optimal for industrial IoT happens to be the only model that’s likely to be viable for “broad IoT”, and also a useful model for evolving services toward hosted components.  Ericsson could have a powerful impact with NodePrime.

The question with something like this is always “but will they?” of course.  There’s enough value in hyperscale or even normal data center management for cloud providers and operators to justify the buy without any specific mission for NFV, SDN, or IoT.  NodePrime was part of Ericsson’s Hyperscale Datacenter System 8000 before the deal was made.  However, the press release focuses on what Ericsson calls “software-defined infrastructure” in a way that seems to lead directly to NFV.

It’s not clear that Ericsson sees NodePrime’s potentially crucial role in IoT, or how success there might actually drive success with “software-defined infrastructure” by short-cutting the path to benefits and a business case.  NodePrime had some industrial IoT engagements before the Ericsson deal and Ericsson is surely aware of them, but there was no specific mention of IoT in the press release.  I had some comments from operators on the deal that suggested Ericsson/NodePrime had raised the vision with them at the sales level, but it’s not clear whether that was a feeler or the start of an initiative.

The qualifier “industrial IoT” used by NodePrime and some publications covering the story may simply reflect the fact that “industrial IoT” uses internal sensor networks and IoT fanatics aren’t prepared to let go of the notion of promiscuous Internet sensors quite yet.  We’ll have to see how this evolves.

A Service Plan for vCPE

The sweet spot for NFV so far has been virtual CPE (vCPE) and the sweet spot for vCPE has been managed services.  Nearly every operator out there has managed services ambitions at some level, but at least three out of four admit that they’re planning in a more hopeful-than-helpful sense.  Is there a right, or best, way to address the opportunity?  Yeah, there is, as you’ve no doubt guessed, and I’ve tried to assemble the guidance here.

To start with, managed service success is rooted in what could be called geo-demographics.  Prospects naturally sell themselves if they’re associated in a geographic sense, sales are easier if you have rich territories, and infrastructure and support are most efficient where customers can be concentrated rather than scattered over a wide area.  Most countries offer both consumer and business census information by locality, meaning zipcode or metro area at the least.  You start there for the geo part.

For demography, you have to look at the various MSP value propositions, which means looking at the service customer you want to turn into an MSP customer.  To start with, that’s an important point in itself because you don’t want to try to sell MSP services to somebody who isn’t connected already.  The selling cycle will be too long.  Ideally your prospect will have a service connection and have issues that can be addressed with a managed service.

Those issues are most likely to be present where there’s little or no network support expertise in-house, or perhaps even in the local labor pool.  If you want to sell a managed service you’re selling management of a service, which means you’re competing with any in-house personnel who happen to provide that already.  Since these very people are likely the ones to be assessing your offering, it can be a tough sell.  Most consumers lack network tech skill, and so do most SMBs.

There are a lot more consumers than SMBs, of course.  In the US there are about 6 million SMB sites and almost 30 times as many households.  However, business willingness to pay is much higher because the equipment and support needed to sustain business connectivity is much more expensive.  Census data that identifies the business population of a given area (I’ll use zipcode here as my example) is available, and most important that information is usually available by industrial classification (SIC, NAICS, etc.) within geography.

The reason you need the industry data is that the consumption of IT and network services varies radically by industry.  The industries that spend the most on network services spend about six thousand times as much as those that spend the least, in the US market.  What you’d like to find is an industry that spends a lot on network services and is well-represented by mid-sized businesses in your geography.

Another piece of industry data to look at is the rate of spending on integration services.  The leading industry in this category, in the US, spends about 15% of its IT budget on integration services, while the last of the pack spends less than half a percent.  If your market geography provides multi-year information you can also look at network spending and integration as a percent of IT spending growth over that period; fast growth usually puts stress on internal support.

When all else fails, look at the ratio of spending on personal computers to minicomputers and servers, meaning central IT.  Where there’s a large central IT structure there’s likely to be more technical support available, while companies with a bunch of distributed PCs often don’t have nearly the level of support the centralized gang do.
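
Pulling those signals together, a simple scoring pass over the census and industry data might look like the Python sketch below.  The weights and index values are placeholders I’ve invented to show the mechanics, not market research.

    # A hypothetical scoring pass for the targeting process described above.  The
    # weights and index values are placeholders, not research data.
    def score_geography(geo):
        # Inputs assumed pre-normalized to 0-100 indexes for this toy example.
        return (0.4 * geo["smb_density_index"]
                + 0.4 * geo["network_spend_index"]
                + 0.2 * geo["integration_spend_index"])

    if __name__ == "__main__":
        candidates = [
            {"zip": "10001", "smb_density_index": 70, "network_spend_index": 85, "integration_spend_index": 40},
            {"zip": "30301", "smb_density_index": 55, "network_spend_index": 95, "integration_spend_index": 20},
        ]
        for geo in sorted(candidates, key=score_geography, reverse=True):
            print(geo["zip"], round(score_geography(geo), 1))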

NFV vendors and analysts may like the idea of targeting specific virtual functions for your vCPE, but my data suggests that there are three broad sweet spots and that further refinement is difficult and often non-productive.  The best area is security, which includes virus scanning, encryption, and firewall services.  Second-best is facility monitoring, meaning the collection of sensor data and activation of control processes, and last is management of IT exercised through the network connection.

The limited experience available with vCPE marketing so far suggests that the best strategy is to present these three categories all at once rather than to focus on a single one, for the solid reason that a single application in a single area isn’t likely to generate broad buy-in on the prospect side, and for such an application a dedicated device might be a viable option.  A good MSP salesperson would try to engage on at least two of the three categories, which means that pricing should favor that.  Pick a target with easy adoption in two areas and run with it.

Be very aware of the nature of the managed services you’re offering relative to the connection model of the user.  There are about 7.5 million business sites in the US, about 1.5 million of which are satellite sites of multi-site businesses.  This group is a great source of opportunity for VPN services, but try selling a VPN to a company with one location!  Also be aware that when you sell a multi-site business, you don’t sell the branches but the headquarters, so you need to check the business name online to see where the HQ is located.  If it’s out of your sales target area, forget it.

Once you have your prospects and your target service, think about fulfillment.  Your approach has to balance the cost of premises hosting on an agile appliance (vCPE) versus centralized hosting in one or more data centers.  The vCPE approach has convincing credentials as the best entry strategy because cost scales with customer base, and it’s going to retain that advantage for even fairly large deployments if the customers are widely distributed.  That’s because network connection and support costs for the tie-back of each user to a small number of data centers could quickly eradicate any economy of scale benefits.
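
Here’s a back-of-the-envelope Python sketch of that trade-off.  Every number in it is a placeholder chosen only to show the shape of the comparison—linear premises cost versus fixed-plus-backhaul central cost—not a real cost model.

    # A back-of-the-envelope comparison of the two fulfillment models.  Every
    # number is a placeholder chosen to show the shape of the trade-off, not a
    # cost estimate.
    def vcpe_cost(customers, box_cost=400.0, support_per_cust=60.0):
        # Premises hosting: cost scales roughly linearly with the customer base.
        return customers * (box_cost + support_per_cust)

    def cloud_cost(customers, dc_fixed=250_000.0, per_cust_hosting=90.0, per_cust_backhaul=120.0):
        # Central hosting: fixed data center cost plus per-customer hosting and
        # tie-back (backhaul) costs, which grow when customers are widely scattered.
        return dc_fixed + customers * (per_cust_hosting + per_cust_backhaul)

    if __name__ == "__main__":
        for n in (100, 1_000, 5_000):
            print(f"{n} customers: vCPE {vcpe_cost(n):,.0f} vs cloud {cloud_cost(n):,.0f}")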

Where this might not be true is where you have a very geographically confined customer set.  A small customer geography means that a small number of hosting points would serve customers without much hairpinning of traffic patterns to reach the hosting points.  This points out the infrastructure value of concentration; your sales types will tell you it also generates better reference accounts.  That’s particularly true if you have a limited industry focus; no reference is perceived to be as valuable as a firm of the same type in the same area.

There is a decent chance that if you can concentrate prospects you could migrate from pure vCPE to a hybrid of vCPE and cloud hosting, or completely to the cloud.  You’ll probably know whether this is even feasible based on the census data because that will give you a top-end estimate of what your prospect base could look like.

A final critical point to remember is that all my operator research shows that the critical question with new NFV services isn’t how well your implementation can do in reducing capex or offering “agility” but how well it manages opex.  There is no practical reduction in capex that can justify a large-scale deployment of NFV absent stringent operations efficiency measures.  Even vCPE is vulnerable to opex issues, and vendors’ positioning of NFV consistently underplays opex impact.  Opex control is particularly important at the “S” end of SMB and in the consumer market, where price sensitivity is so great and scale so necessarily large that even small operations issues are insurmountable.

Opex will also be critical for operators who have goals too lofty for vCPE and managed services to attain.  Funding a broader NFV and SDN base will be critical for those, and that funding cannot be secured through any realistic path other than opex efficiency.  In fact, in large NFV data centers, the operating costs per tenant could easily run three times the per-tenant capex.

My models say it is possible to make money on managed services, and in fact a decent piece of change, but it’s not just a matter of signing up a few VNF vendors and running an ad.  You’ll need a marketing campaign with proper geographic, demographic, and service targeting, and an implementation that can control operations impacts and save your profits.  All of the technical and benefit issues can be resolved using offerings from a half-dozen vendors.  The business issues are going to take leg work on your own.

Is There a Future in Augmented/Virtual Reality?

Last week there were a number of stories out on virtual reality (VR).  It’s not that the notion is new; gaming developers have tried to deliver on it for a decade or more, and Google’s Glass was an early VR-targeted product.  One of the more interesting stories was a joke.  On April 1st, Google spoofed the space with an offering it called “Cardboard Plastic”, a clear plastic visor thing that hid nothing, and did nothing.  It was a fun spoof, but that doesn’t mean that there’s nothing real about VR.  There are a dozen or more real products out there, with various capabilities.  I’m not going to focus on the design of these, but rather on the applications and impact.

From an application perspective, VR’s most common applications are gaming or presenting people with a visual field that includes their texts, which is a kind of light integration with “real” reality.  These combine to demonstrate VR’s scope—you can create a virtual reality in a true sense, meaning an alternate reality, or you can somehow augment what’s real.

Just as we have two classes of application we have two classes of technology—the “complete” models and the “augmented” models.  A complete VR model creates the entire visual experience for the user.  For that, it could mix generated graphics with a captured real-time (or stored) image.  The augmented models are designed to show something overlaid on a real visual field.  Google’s Glass was an example of augmented reality (Cardboard Plastic would have been a “lightly augmented” one too).  Complete VR can be applied to either the alternate reality or augmented reality applications, but the augmented approach is obviously targeted at supplementing what’s real.

The spoof notion of Cardboard Plastic is a kind of signal for where the notion of augmented reality would go, because it demonstrates that you probably don’t want to spend a lot of money and blow a lot of compute power in recreating exactly what the user would see if there was nothing in the way.  Better to show them reality through the device and then add on some projected graphics.  However, the technology to provide for “real-looking” projections and real see-through is difficult to master, particularly at price points that would be affordable.

The complete model is easier at one level and harder at another.  It’s easy to recreate a visual framework from a camera; we do that all the time with live displays on phones and cameras.  The problem is the accuracy of the display—does it “look” real and provide sufficient detail to be useful?  We can approximate both fantasy virtual worlds and augmented reality with complete VR models today, but the experience isn’t convincing.  In particular, the complete model of VR has to deal with the small head movements that the human eye/brain combination wouldn’t convert into a major visual shift, but that VR headsets tend to follow religiously.  Many people get dizzy, in fact.

In theory, in the long term, the difference between the two would shrink as graphics technology improves.  In the near term, the complete models are best seen as a window into a primarily virtual world and the augmented models a window into the real world.  The technical challenges of presenting a credible image of the real world needn’t be solved for augmented-reality devices, which is beneficial if the real world is the major visual focus of the applications.  For this piece, I need to use a different acronym for the two, so I’ll call anything that generates an augmented reality “AR” and the stuff that’s creating a virtual-world reality “VR”.

The applications of AR are the most compelling from an overall impact perspective.  I covered some when Google’s Glass came out; for consumers they include putting visual tags on things they’re passing, heads-up driving displays, or just social horseplay.  For workers, seeing a schematic of something they’re viewing in real time, displaying the steps that need to be taken in a manual task, and being warned of interfering or incorrect conditions are all good examples of valuable applications.

One thing that should be clear in all these applications is that we’re talking about mobile/wearable technology here, which means that the value of AR/VR outside pure fantasy-world entertainment is going to depend on contextual processing of the stimuli that impact the wearer.  You can’t augment reality for a user if you don’t know what that reality is.

There are two levels to augmenting reality, two layers of context.  One is what surrounds the user, what the user might be seeing or interacting with.  Think of this as a set of “information fields” that are emitted by things (yes, including IoT “things”).  Included are the geographic context of the user, the social context (who/what might be physically nearby or socially connected), and the “retail” context representing things that might be offered to the user.  The second level is the user’s attention, which means what the user is looking at.  You can’t provide any useful form of AR without reading the location/focus of the user’s eyes.  Fortunately, that technology has existed in high-end cameras for a long time.

AR would demand that you position augmented elements in the visual field at the point where the real elements they represent are seen.  However, if you move your eyes away from a real element, that should probably signal a loss of interest, which should then result in dimming or removing the augmentation elements associated with it.  Otherwise you clutter up the visual field with augmentations and can’t see the real world any longer.
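To make that behavior concrete, here is a toy Python sketch of the idea: augmentation elements (assumed to have been discovered from surrounding “information fields”) are dimmed or brightened based on where the user is actually looking.  The class names, the angular focus window, and the fade rate are all my own illustrative assumptions, not any product’s API.

```python
from dataclasses import dataclass

@dataclass
class Augmentation:
    label: str
    bearing_deg: float    # direction of the real element, relative to straight ahead
    opacity: float = 1.0

def update_opacities(elements, gaze_bearing_deg, focus_window_deg=15.0, fade_rate=0.2):
    """Dim augmentations outside the user's focus window; restore those inside it."""
    for el in elements:
        # shortest angular offset between the element and the gaze direction
        offset = abs((el.bearing_deg - gaze_bearing_deg + 180) % 360 - 180)
        if offset <= focus_window_deg:
            el.opacity = min(1.0, el.opacity + fade_rate)   # user is attending: brighten
        else:
            el.opacity = max(0.0, el.opacity - fade_rate)   # loss of interest: fade
    return elements

scene = [Augmentation("coffee shop tag", bearing_deg=-40),
         Augmentation("transit stop tag", bearing_deg=5)]
update_opacities(scene, gaze_bearing_deg=3.0)
for el in scene:
    print(el.label, round(el.opacity, 2))
```

The design point worth noting is that attention drives opacity continuously rather than switching elements on and off, which matches the dimming behavior described above.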

As I said earlier here, and in prior blogs on AR/VR, there is tremendous potential for the space, but you can’t realize it by focusing on the device alone.  You have to be able to frame AR in a real context or it’s just gaming, whatever technology you use.  The second of our two layers of context could be addressed in the device but not the first.

At its best, AR could be a driver for contextual behavior support, which I’ve also talked about before.  Those “fields” emitted by various “things” could, if organized and cataloged, tell an application what a user is seeing given the orientation and focus of their AR headset.  If you have this kind of input you can augment reality; if not, you’re really not moving the ball much and you’re limiting the utility and impact of your implementation.

This frames the challenge for useful augmented reality, which includes all those business apps.  The failure of the initial Google Glass model shows, I think, that we can’t have AR without the supporting “thing fields”.  We will get them either because AR capability pulls them through or because they arise from IoT and contextual services, and I think the latter is the more realistic model because the cost of extensive deployment of information-field resources would be too high for an emerging opportunity like AR to pull through on its own.  Google Glass showed that too.

What this means is that meaningful AR/VR will happen only if we get a realistic model for IoT that can combine with contextual user-agent services to create the framework.  That makes the IoT/context combination even more critical.

IBM’s Bluewolf Deal Says the Future is Less New Technologies than New Buyer Focus

Any acquisition by a major market player like IBM is news, and yesterday’s announcement that IBM was acquiring Bluewolf, a Salesforce professional services firm, could be big news.  According to the press release cited here, the deal will “Accelerate Cloud-based Customer Experiences for Salesforce Users,” and the obvious question is why IBM would want to do that—and just being nice is probably not the answer.

Salesforce is the leader in SaaS, a form of cloud service that has two very interesting characteristics.  First, it’s higher on the value chain than any other form of cloud computing.  SaaS users buy solutions, not platforms.  That means it displaces more internal cost than other cloud models (IaaS, PaaS) and can therefore tolerate higher prices and better margins for the seller.  Second, SaaS disintermediates internal IT, creating what the media has characterized as “shadow IT” that takes computing adoption right to line departments.  I think the combination of these two factors has to somehow explain IBM’s move, so let’s look at what the combination means.

First, and I think foremost, the deal can help IBM with its disintermediation problem.  IBM’s engagement model was wonderful in the past, when IT organizations relied on IBM salespeople for technology strategy.  The problem was that as IT opportunity started to shift down-market, that model was too expensive to apply in the new battleground of SMB.  Salespeople simply couldn’t call on the buyers, nor could IBM afford the inevitable education the buyer needed in order to get anything done.  The line-department buyer is in the same boat; IBM doesn’t call on them and there are too many to seduce and educate.

But line adopters of SaaS have come to realize that rolling your own IT without any support is a non-starter.  You create awful problems in security and compliance, and often end up with something that doesn’t make your business case even if you avoid or remediate those problems.  Of all the forms of cloud technology, my data says SaaS has the highest ratio of professional services cost to total project cost.  That creates an opportunity to get people to pay for something they’d expect for free if you told them you were selling a product like hardware or software.

But why buy a firm to get the skill?  It’s not like IBM doesn’t understand IT and applications.  I think the answer is that IBM believes that the “tactical” nature of cloud computing is going to generate a lot of interest in ad hoc IT, and that these interests will necessarily arise with the consumers of the service and not with the IT organizations.  Even today, enterprises tell me that the greatest pressure to accommodate mobility doesn’t come from IT organizations but from the organizations whose workers need mobile empowerment for productivity enhancement.  Thus, SaaS might be on the leading edge not only of a shift in buying power but a shift in technology focus.  IBM wants its mobile strategy to work, and mobile-empowerment buyers might well be SaaS buyers.

And that’s not all; we have what I think is the killer reason for a deal in SaaS professional services right now.  It’s the fact that SaaS does not now, nor will it ever, deploy in a vacuum.  Only a vanishingly small number of even mid-sized businesses will adopt a 100%-public-cloud model for IT.  Every cloud will be a hybrid cloud, which means that IBM has two levels of risk/opportunity exposure arising from the hybridization.  One is that big customers who have, or are prospects for, IBM solutions are very likely to need to hybridize their Salesforce deployments and other future SaaS offerings with internal IT.  The other is that SaaS professional services could be a pathway to profitable engagement with the down-market segment that IBM’s traditional sales force can’t afford to call on.  Done right, a professional services engagement on Salesforce could position what is almost an IBM sales force paid for by the customer, sitting on the customer site and engaged in IT strategy.

If we look at this from the top and unify the thinking, here’s what we get.  IT has to open new productivity avenues in order for IT budgets to rise—that’s what ROI is all about.  Those new productivity avenues are best understood by the organizations whose workers need to be empowered in new ways.  Those organizations, frustrated by the cost and performance of internal IT, are looking for ad hoc self-directed solutions to their IT issues, and SaaS in general and Salesforce in particular have emerged as the path to those solutions.  Because all of the real gains in IT spending are very likely to emerge from these SaaS explorations, and because these incremental changes will still have to be integrated with corporate IT and the applications it supports, all the good stuff in the future opportunity pool may be focused on the very space Salesforce professional services is already addressing.  You snooze, thinks IBM, and you lose.

This isn’t all that radical a viewpoint.  For most of the past, IT budgets have drawn on two different sets of funding, one to sustain what’s already committed in terms of an IT organization and applications, and a “project” budget to move IT into new areas.  Over time, the project budgets have shifted from empowerment goals to cost-management goals, and line departments have found themselves with little support for improvements in their own operation.

So will this work for IBM?  It might, but IBM is obviously not the only player who sees the light.  HPE, for example, has just announced its own market-supporting consulting strategy based on the industry specification called “IT4IT”, which links IT-user-centric enterprise architecture and business support processes with IT processes in a more direct and efficient way.  IT4IT could be a way of fostering improved engagement of IT organizations in the new productivity paradigms, which would not only give vendors like HPE a pathway to participate in these new benefit-creating processes, but also tend to keep internal IT in control.  And it doesn’t foreclose adding SaaS and even Salesforce to HPE’s targets of opportunity.

These moves suggest that IT is going to get a lot broader, that IT engagements that bring big, real, opportunities will have to be more aligned with business goals than ever before, and that IT organizations will have to earn a place in this evolution.  I suspect the same thing can be expected with networking, and that if there are “new service” opportunities out there, the opportunities are new not for their technology but for the target buyer.

Could Brocade’s StackStorm Deal Be the Start Of Something?

The acquisition of StackStorm by Brocade raises (again) the question of the best way to approach the automation of deployment and operations processes.  We’ve had a long evolution of software-process or lifecycle automation and I think this deal shows that we’re far from being at the end of it.  Most interesting, though, is the model StackStorm seems to support.  Might Brocade be onto a new approach to DevOps, to service lifecycle management, and to software automation of event processing with a broader potential than just servers and networks?

The end goal for all this stuff is what could be generalized as software-driven lifecycle management.  You deploy something and then keep it running using automated tools rather than humans.  As I’ve noted in past blogs, we’ve had a basic form of this for as long as we’ve had operating systems like UNIX or Linux, which used “scripts” or “batch files” to perform complex and repetitive system tasks.

As we’ve moved forward to virtualization and the cloud, some have abandoned the script model (also called “prescriptive”, “procedural” or “imperative” to mean “how it gets done”) in favor of an approach that describes the desired end-state.  This could be called the “declarative” or model-driven approach.  We have both today in DevOps (Chef and Puppet, respectively, are examples) and while it appears that the declarative model is winning out in the cloud and NFV world, we do have examples of both approaches here as well.
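As a rough illustration of the difference, here’s a minimal Python sketch contrasting the two styles.  The resource names and handlers are stand-ins I’ve invented, not Chef or Puppet syntax.

```python
# Prescriptive/imperative: spell out *how*, step by step, in order.
def deploy_imperative(run):
    run("install package nginx")
    run("write /etc/nginx/nginx.conf")
    run("restart service nginx")

# Declarative/model-driven: declare the desired end-state and let a
# reconciler work out what (if anything) needs to be done.
desired = {
    "package:nginx": "installed",
    "file:/etc/nginx/nginx.conf": "present",
    "service:nginx": "running",
}

def reconcile(actual, desired, apply_change):
    """Drive the actual state toward the declared state, whatever it is now."""
    for resource, target in desired.items():
        if actual.get(resource) != target:
            apply_change(resource, target)
            actual[resource] = target
    return actual

deploy_imperative(print)
reconcile({"package:nginx": "installed"}, desired,
          lambda r, t: print(f"reconcile {r} -> {t}"))
```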

One thing that gets glossed over in all this declaring and prescribing is that software automation of any set of lifecycle processes is really about associating events in the lifecycle to actions to handle them.  This sort of thing would be an example of classic state/event programming where lifecycle processes are divided into states, within which each discrete event triggers an appropriate software element.  StackStorm is really mostly about this state/event stuff.

StackStorm doesn’t fall cleanly into either the declarative or prescriptive models, IMHO.  You focus on building sensors that generate triggers, and rules that bind these triggers to actions or workflows (sequences of actions).  There are also “packs” that represent packaged combinations of all of this that can be deployed and shared.  It would appear to me that you could use StackStorm to build what looked like either a declarative or prescriptive implementation of lifecycle management.  Or perhaps something that’s not really either one.
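Here’s a generic Python sketch of that trigger/rule/action binding, written to illustrate the pattern rather than StackStorm’s actual API; the trigger names, criteria, and actions are all hypothetical.

```python
rules = []   # each rule binds a trigger type (plus criteria) to an action

def register_rule(trigger_type, criteria, action):
    rules.append((trigger_type, criteria, action))

def dispatch(trigger_type, payload):
    """Called by a 'sensor' when it observes something; fires matching rules."""
    for t, criteria, action in rules:
        if t == trigger_type and criteria(payload):
            action(payload)

register_rule("vm.cpu_high",
              criteria=lambda p: p["cpu"] > 90,
              action=lambda p: print(f"scale out tenant {p['tenant']}"))

dispatch("vm.cpu_high", {"tenant": "acme", "cpu": 95})   # criteria met, action fires
dispatch("vm.cpu_high", {"tenant": "acme", "cpu": 40})   # criteria not met, nothing happens
```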

All of my own work on automating lifecycle processes has shown that the thing that’s really important is the notion of state/event handling.  A complex system like a cloud-hosted application or an SDN/NFV/legacy service would have to be visualized as a series of functional atoms linked in a specific way.  Each of these would then have their own state/event processing and each would have to be able to generate events to adjacent atoms to synchronize the lifecycle progression.  All of this appears to be possible using StackStorm.
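A minimal sketch of what that could look like, assuming a made-up set of states, events, and a two-atom service graph; none of this reflects a specific product’s model.

```python
class Atom:
    """A functional atom with its own state/event table and adjacent atoms."""
    def __init__(self, name):
        self.name = name
        self.state = "idle"
        self.neighbors = []

    def handle(self, event):
        # (current state, event) -> (new state, event to propagate to neighbors)
        table = {
            ("idle", "deploy"):          ("deploying", None),
            ("idle", "neighbor_ready"):  ("deploying", None),
            ("deploying", "ready"):      ("active", "neighbor_ready"),
            ("active", "fault"):         ("degraded", "neighbor_fault"),
        }
        new_state, emit = table.get((self.state, event), (self.state, None))
        print(f"{self.name}: {self.state} --{event}--> {new_state}")
        self.state = new_state
        if emit:
            for n in self.neighbors:
                n.handle(emit)

vpn, firewall = Atom("vpn"), Atom("firewall")
vpn.neighbors = [firewall]

for ev in ("deploy", "ready"):
    vpn.handle(ev)
```

The key property is that each atom knows only its own state table and its neighbors, which is what lets lifecycle progression synchronize across a complex service without any single element understanding the whole thing.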

In effect, StackStorm is a kind of PaaS for software automation of lifecycle processes.  As such, it has a lot of generalized features that would let you do a lot of different things, including addressing virtually any software-based feature deployment and utilizing scripts, tools, and other stuff that’s already out there.  The flip side of flexibility is always the potential for anarchy, and that means that StackStorm users will have to do a bit more to conceptualize their own software lifecycle management architecture than users of something like TOSCA (I think you could implement TOSCA through StackStorm, though).

As a pathway to automating cloud deployment, StackStorm could be huge.  As a mechanism for automating service lifecycle management in SDN, NFV, and even legacy services, StackStorm could be huge.  Played properly, it could represent a totally new and probably better approach to DevOps, ServOps, and pretty much everythingOps.  Of course, there’s that troublesome qualifier “properly played….”

I didn’t find a press release on the deal on Brocade’s site, but the one on StackStorm’s offered only this comment: “Under Brocade, the StackStorm technology will be extended to networking and new integrations will be developed for automation across IT domains such as storage, compute, and security.”  While this pretty much covers the waterfront in terms of the technologies that will be addressed, it doesn’t explicitly align the deal with the cloud, virtualization, SDN, NFV, IoT, etc.  Thus, there’s still a question of just how aggressive Brocade will be in realizing StackStorm’s potential.

The key to this working out for Brocade is the concept of “packs”, but these would have to be expanded a bit to allow them to define a multi-element model.  If that were done, then you could address all of the emerging opportunities with StackStorm.  What would be particularly helpful to Brocade would be if there were some packs associated with SDN and NFV, because Brocade is too reliant on partners or open operator activities to provide the higher-level elements to supplement Brocade’s own capabilities.

It would be interesting to visualize these packs as composable lifecycle management elements, things that could be associated with a service feature, a resource framework, and the binding between them.  If this were to happen, you’d need to have a kind of PackPaaS framework that could allow generalized linkage of the packs without a lot of customization.  This is something where a data framework could be helpful to provide a central way of establishing parameters, etc.

It would also be interesting to have a “data-model-compiler” that could take a data representation of a service element and use it to assemble the proper packs, and thus provide the complete lifecycle management framework.  This would certainly facilitate the implementation of something like TOSCA (and perhaps the ETSI NFV model with the VNF Descriptor) where there’s a data framework defined.
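As a thought experiment, such a “data-model-compiler” might look something like the Python sketch below.  The model fields, the pack catalog, and the pack names are all invented for illustration; they don’t correspond to TOSCA, the ETSI VNF Descriptor, or any StackStorm pack that actually exists.

```python
# Hypothetical catalog mapping element types to lifecycle-management packs.
PACK_CATALOG = {
    "vnf":      "vnf-lifecycle-pack",
    "sdn-path": "sdn-path-pack",
    "legacy":   "legacy-cli-pack",
}

def compile_service(model):
    """Return the list of packs (with parameters) needed to manage this service model."""
    packs = []
    for element in model["elements"]:
        pack = PACK_CATALOG.get(element["type"])
        if pack is None:
            raise ValueError(f"no pack available for element type {element['type']!r}")
        packs.append({"pack": pack, "parameters": element.get("parameters", {})})
    return packs

service = {
    "name": "business-vpn",
    "elements": [
        {"type": "vnf", "parameters": {"image": "vfirewall"}},
        {"type": "sdn-path", "parameters": {"bandwidth_mbps": 50}},
        {"type": "legacy", "parameters": {"device": "edge-router-12"}},
    ],
}

for p in compile_service(service):
    print(p["pack"], p["parameters"])
```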

The last “interesting” thing is a question, which is whether something like this could effectively take the “Dev” out of “DevOps”.  Many of the tools used for deployment and lifecycle management were designed to support a developer-operations cooperation.  That’s not in the cards for packaged software products.  Is StackStorm agile enough to be able to accommodate packaged-software deployment?  Could “packs” represent packages to be deployed, perhaps even supplied by the software vendors?  Interesting thoughts.

We may have to wait a while to see whether Brocade develops StackStorm along these lines, or even to see what general direction they intend to take with it.  But it’s certainly one of the most interesting tech M&As in terms of what it could show for the buyer, and for the market.

How Did SDN/NFV Vendors Lose the Trust of their Buyers (And Can they Reclaim It?)

If you look at or listen to network operator presentations on next-gen networking, you’re struck by the sense that operators don’t trust vendors anymore.  They don’t come out and say that, but all the discussion about “open” approaches and “lock-in” demonstrates a healthy disdain for their suppliers’ trustworthiness, and the mere fact that major operators (like both Verizon and AT&T in the US) are driving their own buses in their network evolution speaks volumes.  What’s going on here, and are operators right in their view that vendors can’t be trusted to serve their interests?

My surveys have shown operator trust in vendors eroded a long time ago.  Ten years ago, operators in overwhelming numbers said that their vendors did not understand or support their transformation goals.  They were right, of course.  The challenge that the evolution of network services has posed for operators is mass consumption, where cost is paramount and where public policy and regulations have constrained the range of operator responses.  Vendors simply didn’t want to hear that the network of the future would have to cost a lot less per bit, and that stagnant revenue growth created by market saturation and competition could only result in stagnant capital budgets.  Cisco, under Chambers, was famous for their “traffic-is-coming-so-suck-it-up-and-buy-more-routers” story, but every vendor had their variant on that theme.

When NFV was first launched, I was surprised to see that even at one of the first meetings of the ETSI ISG, there was a very direct conflict between operators and vendors.  Tellingly, the conflict was over whether the operators could dictate their views in an ETSI group where ETSI rules gave vendors an equal voice.  Ultimately the vendors won that argument and the ISG went off in a direction that was far from what major operators wanted.

Vendor domination of the standards processes, generated by the fact that there are more vendors than operators and that vendors are willing to spend more to dominate than operators are, is the proximate cause of the current state of distrust of vendors.  Since large operators, the former regulated incumbents, are still carefully watched for signs of anti-trust collusion, operators themselves can’t form bodies to solve their collective problems, and open standards seemed to be the way out.  It didn’t work well, and so the next step was to try to drive open-source projects with the same goals.  That’s showing signs of issues too.

So far in this discussion, “vendors” has meant “network vendors”, because operators’ concerns about greed-driven intransigence focused, obviously, on the vendors whose business cases were threatened by stagnant capital spending or a shift in technology.  In that same early ISG period, operators were telling me there were three vendors they really wanted strong solutions from—Dell, HP (now HPE) and IBM.  Eventually they ended up with issues with all three of these new vendors too, not because the three were obstructing the process but because operators didn’t perceive them as fully supporting it.  Neither Dell nor IBM, IMHO, fields a complete transformation solution, and while HPE has one it hasn’t completely exploited its own capabilities.  In the operators’ view, they had no vendors left and no viable paths to standardization or community development.  As a last resort, you have to do your own job yourself.

If you look at the public comments of AT&T and Verizon as examples, operators are increasingly focusing on self-directed (though not always self-resourced) integration rather than on collective specifications or development.  They’re wary even of what I’d personally see as successful open-source projects like ODL for SDN, yet they’re willing to adopt commercial products as long as they can frame that adoption in an open model that prevents lock-in.

Open models prevent lock-in.  Integration links elements into open models.  That’s the formula that’s emerging from early examples of operator-driven network evolution.  They’re willing to accept even proprietary stuff as an expedient path to deployment but it has to be in an open context, because down the line they intend to create a framework where vendor differentiation can never generate high profit margins for vendors.  Their own bits are undifferentiable, and so their bit production must also be based on commodity technology.  Open-source software, Open Compute Project servers, white-box switches—these are their building blocks.

So does this mean the End of Vendors as We Know Them?  Yes, in a sense, because it means the end of easy differentiation and the end of camel’s-nose selling, where you have something useful that pulls through a lot of chaff.  In a way that’s a good thing, because this industry, despite tech’s reputation for innovation, has let itself stagnate.  I was reading a story on the new Cisco organization, and the author clearly believed that one of Cisco’s great achievements was using MPLS to undermine the incentive for thorough SDN transformation.  That’s not exactly innovation in action, and innovation, of course, is both one of the operators’ current issues and the path to vendor salvation.

Innovation doesn’t mean just doing something different, it means doing something both different and differentiable in a value or utility sense.  If a vendor brought something to the operator transformation table that would truly change the game and improve operator profits, the operators would be happy to adopt it as long as it didn’t carry with it so much proprietary baggage that the net benefit would be zero (or less).

Ironically, the operators may be setting out to prove the truth of the old saw “We have met the enemy and they are us!”  Operator vision has been just as lacking as vendor vision, and in many ways it shows the same greed-driven lack of grounding in market realism.  IoT is the classic example of operator intransigence.  Operators promote a vision of IoT whose economic, security, and public-policy problems are truly insurmountable, when at least one alternative vision addresses them all.

We have IoT pioneers, ranging from credible full-spectrum giants like GE Global Research’s Predix activity to startup innovator Bright Wolf.  Network vendors want in on the game (one of Cisco’s new key groups is focused on IoT) and IT vendors like IBM and HPE have ingredients for the right Big Picture of IoT.  I think it may be that IoT will be the catalyst to educate both sides of the vendor/operator face-off.  It might also be an indicator that even the current SDN/NFV transformation initiatives of the operators will suffer serious damage from operators’ own shallow thinking.  If the transformed network doesn’t promote a vision for what’s likely to be the major new application of network services, it has little hope of helping operator profits.

Because some of the biggest drivers of change, like IoT, are yet to be “service-ized” as they must be, there’s still a chance for vendors to redeem themselves there.  The right answer is always valuable, especially if it can help you move on market opportunities faster.  But vendors don’t have to wait for these big opportunities; there is still plenty of time to offer a better approach than operators are devising on their own, and at a fair profit margin for vendors.  It will take some work, though, and I’m not sure at this point that vendors are willing to invest the effort needed.