Should Operators Combine the CTO and CIO Roles?

The TM Forum is a body that almost defines the love/hate concept. I’ve been associated with it for well over a decade, been a member twice and dropped it (or was dropped) twice. It’s done good things and it’s also been incredibly obtuse, difficult, and parochial. Now, according to a Fierce Telecom piece, it’s got another tick in the “love” column because I almost love what they’re suggesting. We’ll get to the suggestion and my “almost” qualifier below.

Operators have traditionally had three critical organizational divisions, each with a “Chief”. The main network body is run by the Chief Operations Officer, the Science and Technology piece by the CTO, and the OSS/BSS part by the CIO. In current practice, the COO and CTO tend to develop the main thrust of the network itself, choosing and deploying technologies. Standards membership is under the CTO. The CIO is almost a separate business, maintaining the software that forms the business side of the operator. There’s a healthy (or sometimes, to be frank, unhealthy) rivalry between the CIO and the rest of the organization.

What’s making the rivalry increasingly unhealthy is the fact that software-defined network features in any form mean shifting the network from a collection of purpose-built devices (switches and routers) to one increasingly based on software, which in practice means cloud-hosted features. In addition, the need to optimize both service agility and operations efficiency mandates a tight linkage between OSS/BSS software and the hosted elements of network services.

Up to now, the transformation to software-centric networking has been managed by the CTO organization in general, and more specifically by the part of that organization that hosts participation in formal standards groups. Neither the CTO nor the standards types have any real experience with software; nobody in a network operator other than the CIO has that experience. The NFV ISG has gotten off-track largely because of the lack of software-centricity among the operator representatives (and greed, perhaps, among vendor reps).

What the TMF wants to see is a merger of the CTO and CIO functions. That would bring an infusion of software knowledge to the CTO side, and break down a barrier to cooperation between service and network operations. I think this is a great step to take, even perhaps a critical one. I’ve seen the tension between CTO and CIO people in action for years, and it’s never been helpful. Having someone else tell them to play nice hasn’t worked, so combining the two functions seems the only step left.

So why is this an almost-love on my part? The reason is a paraphrase of the old adage “Two wrongs don’t make a right.” There is no question that the CTO organization is stuck in router-think, or at least box-think, and has no real conception of what a software-defined network would be. But there’s equally no question that CIOs are stuck in OSS-think and have no notion of what a top-down, market-driven world would look like.

Standards activities define the CTO and the organization overall. They’re relics of the supply-side thinking that regulated monopolies are almost forced to adopt. You build stuff and they come. There’s no need to figure out what “they” want; they’ll use what you supply because that’s all there is. If it takes you five years to get to where a commercially driven process would have gotten in six months, they’ll wait. There was a better, almost-optimal, model of NFV defined in mid-2013 and demonstrable by the end of the year, and it wasn’t adopted. The alternative that was adopted is still unworkable and still in progress as we enter 2019.

But the other side, the CIO side, has its own anchor, and ironically it’s the TMF. More than a decade ago, the TMF came up with what was literally the key and central element of the software-defined future, which was lifecycle orchestration of services by mediating the flow of service events to service processes using the service contract as a data model. “NGOSS Contract”, as it was called, was never widely implemented, in part because it was never effectively promoted.
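
To make the idea concrete, here’s a minimal sketch of what “using the service contract as a data model” to steer events to processes could look like. The names and structures are entirely hypothetical illustrations, not anything the TMF actually defined.

```python
# A minimal sketch of the NGOSS-Contract idea: the service contract is a data
# model, and each element in it carries a state/event table that maps incoming
# service events to the processes that should handle them. All names here are
# hypothetical illustrations.

def activate(element, event):  print(f"activating {element['name']}")
def scale_out(element, event): print(f"scaling {element['name']} for {event}")
def redeploy(element, event):  print(f"redeploying {element['name']} after {event}")

service_contract = {
    "service": "business-vpn-101",
    "elements": [
        {
            "name": "vpn-core",
            "state": "active",
            # (current state, event) -> handler process
            "event_map": {
                ("ordered", "deploy"):  activate,
                ("active", "overload"): scale_out,
                ("active", "fault"):    redeploy,
            },
        },
    ],
}

def handle_event(contract, element_name, event):
    """Mediate an event: find the target element in the contract and dispatch
    to whatever process its state/event table names."""
    for element in contract["elements"]:
        if element["name"] == element_name:
            handler = element["event_map"].get((element["state"], event))
            if handler:
                handler(element, event)
            return

handle_event(service_contract, "vpn-core", "fault")
```

The contract itself, not the software around it, decides what happens next, which is the whole point of the model.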

Just this year, a long-standing thought leader within the TMF told me that the NGOSS Contract notion was getting new life. Well, it’s been over six months since I was told that. Where is it? Standards groups like the NFV ISG have taken way too long to come to a useful conclusion. You can make the same statement about the TMF with respect to things like the ground-breaking NGOSS Contract. Lengthy processes get behind the market, tax the ability of key people to contribute through the life of the initiative, and raise the cost of the work overall. Difficulty in participating is why I’ve not stayed with the TMF, and it’s also why I cut back on NFV ISG involvement. Only people paid to be standards-workers could possibly contribute the time needed.

What I’m saying is that neither the CTO nor the CIO processes of today have any positive record with respect to moving a good idea from inception to deployment at commercial speed. A public cloud giant like Amazon or Google would laugh at the processes of both bodies, and certainly their success in being market-driven can be said to derive from a determination to ignore the operation of CTO and CIO organizations alike.

That’s the source of my qualifier. Can a combination of a non-responsive CIO and CTO process lead to a “Chief Software Officer” or CSO that’s responsive? I give the TMF a lot of credit for coming up with the notion of a combined position, but just as defining the concept of the NGOSS Contract was a seminal leap forward that ended up being the only leap taken, the union of CTO and CIO could simply create one big insufficiency instead of two. The step beyond that’s needed is to direct the unified body at a valuable role and an efficient model of filling it.

That model probably doesn’t include either formal standards bodies or industry groups like the TMF, unless the way those activities operate is radically changed. Open-source needs top-down guidance, and it would be possible to frame either CTO or CIO processes to provide it, if we had industry forums that were designed the “modern software” way and not the old way, where “old” means both CTO-centric standards work and CIO-centric TMF work.

It may well be that the future will belong to whichever body figures out how to drive the software-centric future. Who’s up for the challenge here? We may find out shortly, because 2019 is a year when operators want to see some results. Here, the TMF may have an advantage because unlike the formal standards bodies, they’ve actually proven they can come up with something highly insightful. They just need to prove they can do something with the insight while the market need still exists.

Why vCPE and uCPE are the Wrong Approach

I’ve gotten a lot of recent questions from operators and others (even some on LinkedIn) about deploying service features (VNFs or otherwise) in carrier cloud or on “universal or virtual CPE”. This is a concern that dates back almost to the beginning of the NFV ISG work, and the tension between the two options has IMHO been a factor in diverting the efforts of the ISG to a less-than-useful path. Given these points, I think it would be useful to address the issue.

We should start by saying that service features have typically been created by inserting appliances into a service data stream. Firewall devices provided firewalls, in other words, and these appliances were called “physical network functions” or PNFs by the ISG. The principal goal of the NFV ISG was to define a model for hosting virtual network functions that substituted for PNFs. In the initial Call for Action white paper, the operators who launched the NFV work presumed that this hosting would take place on commercial off-the-shelf (COTS) servers.

One of the service agility benefits quickly proposed within the ISG was the creation of multi-part services by the chaining of VNFs, and this gave rise to the “service chaining” interest of the ISG. A virtual device representing a service demarcation might thus have a VPN VNF, a firewall VNF, and so forth. Recently, SD-WAN features have been proposed via an SD-WAN VNF in the chain. All of this got framed in the context of “virtual CPE” or vCPE.

As a practical matter, though, you can’t fully virtualize a service demarcation; something has to provide the carrier-to-user connection, harmonize practical local network interfaces (like Ethernet) with a carrier service interface, and provide a point of management handoff where SLA enforcement can be monitored by both sides. Think of home gateways as an example; you need to have something that provides WiFi to the home, and that something has to be in the home. That’s true with at least some pieces of any service demarcation, and that almost immediately gave rise to the proposal that vCPE include both cloud hosting and hosting on a generalized premises device, something now often called “universal CPE” or uCPE.

Could you deploy a service chain of functions (VNFs) into a uCPE box, as though it was an extension of carrier cloud and using the set of features and capabilities the ISG has devised (and is still working on)? Perhaps, but the better question would be “Should you?” There are in my view some compelling reasons not to do that.

The first reason is a kind of appeal for logical consistency. The goal of the ISG has been to create, via virtualization and hosting, the software equivalent of a set of PNFs that could then be managed using traditional EMS/NMS/SMS tools. If we have a uCPE box, why would that box need special treatment to be so managed? There are a number of different ways we could stack software elements in a piece of uCPE, and all of them would produce, in an abstract sense, exactly the same virtual box with the same management properties.

And all of them would be better and more logical, too. Service chaining is a network-connection function, something you do to link cloud-hosted elements in a chain so they’d function as a single linear flow. There is no need for a network inside a device, which means that if you want to do uCPE hosting of a service chain, you have to either devise a method of providing a software linkage and then support that as a deployment alternative to a network link in NFV, or you have to force uCPE to have an internal network to chain the pieces.

I think both these approaches would be silly to say the least, but the whole uCPE issue raises what’s perhaps the most compelling question, which is whether you’d do a service chain even in the cloud. Suppose I have three possible software VNFs that might be combined to create a single piece of vCPE. You might say we have “firewall”, “VPN”, and “SD-WAN”. There are perhaps a half-dozen commercially meaningful combinations of those features. Why not simply create a unified VNF for each of those combinations?

There is no reason to have separate VNFs, which demand separate hosting, unless you can scale or replace them individually and gain some value. In a serial flow, I don’t think you can point out what the value would be. Furthermore, reliability theory says that if you need three independent serial elements for something to work, the chances that all three will work are lower than the chances that a single element with all three feature sets would be working. And since separate hosting costs more, you’d be paying more for getting less. Service chaining never made any sense in a vCPE application, anywhere. To require it be supported and used for uCPE hosting of combination-VNF services would make even less sense (if that’s possible).
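
A little arithmetic makes the reliability point. The availability figures below are assumptions chosen only for illustration, not measurements of any real VNF.

```python
# Illustrative arithmetic only: a chain of three independently hosted VNFs is
# up only when all three are up, so its availability is the product of the
# parts. The per-VNF availability is a hypothetical assumption.

firewall, vpn, sdwan = 0.999, 0.999, 0.999   # assumed per-VNF availability

chained  = firewall * vpn * sdwan            # three separately hosted VNFs in series
combined = 0.999                             # one unified VNF with all three features

print(f"service chain availability: {chained:.6f}")   # roughly 0.997
print(f"unified VNF availability:   {combined:.6f}")  # 0.999
```

Three nines in series gets you something closer to two and a half nines, and you paid for three hostings to get it.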

What I think this adds up to is that we do need the concept of elastic uCPE, elastic in the sense that it supports the insertion or replacement of feature elements. However, does it have to be implemented the way that “cloud-hosted service-chained VNFs” would be? No, nor is that idea itself even useful. What should be done, then?

The answer is that we need to go back to abstraction, the key to virtualization and the cloud. If we have the concept of a virtual device, and we assign properties to things of that class, including the property of recognizing feature slots into which we can insert feature components, then uCPE is defined as it needs to be. The “object” uCPE represents must then provide for the insertion of components into slots, but exactly how that happens is inside the black box of abstraction, and nobody knows or should care.
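
Here’s a rough sketch of what that abstraction might look like in code. The class, slot, and component names are hypothetical; the point is only that everything above sees the abstract properties, while how a component actually gets “inserted” stays inside the implementation.

```python
# A sketch of the virtual-device abstraction with feature slots. How a feature
# is realized (cloud-hosted, run inside a uCPE box, or baked into firmware) is
# hidden behind the abstract interface. All names are hypothetical.

from abc import ABC, abstractmethod

class VirtualDevice(ABC):
    """External properties of the abstraction: named feature slots and insertion."""
    slots = ("vpn", "firewall", "sd-wan")

    @abstractmethod
    def insert_feature(self, slot: str, component: str) -> None: ...

    @abstractmethod
    def active_features(self) -> dict: ...

class UCPEDevice(VirtualDevice):
    """One possible implementation; a cloud-hosted one would expose the same interface."""
    def __init__(self):
        self._features = {}

    def insert_feature(self, slot, component):
        if slot not in self.slots:
            raise ValueError(f"unknown feature slot: {slot}")
        self._features[slot] = component   # how this is realized is invisible from above

    def active_features(self):
        return dict(self._features)

box = UCPEDevice()
box.insert_feature("firewall", "acme-fw-3.1")   # hypothetical component name
print(box.active_features())
```

Define the abstract component and its external properties first, then the implementation, which is exactly the programming discipline described below.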

We’ve had the software support for this underlying principle of abstraction for decades. I recall the introduction of the concept in programming with the Modula-2 language back in the early 1980s, in fact. You would first define an abstract component with its external properties, then the implementation of that component. This same approach was carried into Java, the most popular programming language today. It’s the way software-think has gone for a long time, and so it’s clear that the current vCPE/uCPE approach is a regression into device-think. That’s surely off-target when we’re supposed to be at the dawn of the software-defined age in networking.

Many vendors and operators know all of this; some have been more outspoken than others, but my own contact with both groups shows the awareness of the problem (and the solution) pretty clearly. The challenge is that the commercial benefits for doing the right thing aren’t clear, and so vendors are understandably not eager to poison the market when operator spending is already threatened. Future transformation and future spending depend on somebody eventually driving this, and it may have to be the operators themselves.

Optical Grooming and the Simplification of IP

The second of the three 2019 trends I blogged about a couple weeks ago is the subduction of IP network features. IP, largely because of the Internet, has become the “service protocol of choice” globally, meaning that applications and devices are built to presume IP connectivity. The IP model has a lot of issues associated with it, and many vendors and organizations have at various times proposed a different approach. In fact, in the ‘80s and earlier, we had a different approach; networks were built on protocol-independent transport. A more protocol-neutral approach to networking may now be emerging.

Fundamental to any new network model is the reality that IP is the service protocol of choice, and that any attempt to replace it is surely doomed to failure. Thus, a new network model has to focus on creating something different below the IP service layer. The OSI model, which is more recent than IP but still ancient by today’s standards, allowed for and even mandated multiple layers of protocol functionality, with each layer having a role of supporting the layers above. IP doesn’t really break neatly into OSI layers (as I said, it predates it), but if we could map some basic transport-like IP features downward to something else, we might emerge with a simple IP service layer and a better framework for security and service management.

Optical networking is typically based on Ethernet as an electrical protocol riding on fiber or on wavelengths (lambdas) within a fiber; this is Level/Layer 1 of the OSI model, the physical layer. Fiber is the most critical transport element; it’s the fabric on which virtually all wide-area networking is built, but fiber capacity improvements mean that few sites can justify dedicated fiber connections. Residential fiber (fiber to the home or FTTH) is based on passive optical network sharing, and even that doesn’t lower costs enough to enable direct fiber connection of most homes, or even most small business sites.

Even where fiber is justified, additional subdivision of fiber capacity (called “grooming”) is essential if you want to support multiple higher-level connections on a single fiber. There have been a number of approaches offered to fiber grooming, ranging from some form of Ethernet (the preferred approach, obviously, of the MEF) to SDN. Since most fiber uses an Ethernet overlay, Ethernet is a logical choice, and if all we wanted to do was to subdivide fiber into electrically generated subchannels, that would likely be the choice.

What complicates that model is the fact that if you’re going to do electrical-layer handling it would be nice to be able to cross-connect the electrical tunnels to create end-to-end pipes. In effect, this creates a virtual physical layer that can be much more connective than optics, and in addition that layer can provide routing and rerouting. That means that the IP service layer can reduce or eliminate its own features in that area. The electrical layer subnetworks also can separate IP networks to improve security.

Ciena is a logical kingpin player in this expanded optical/electrical hybrid space. They’ve been doing M&A to supplement their own optical products, including a vendor who supplies SDN and service automation and orchestration (Blue Planet) and one that adds in sophisticated device discovery and inventory (DonRiver) and custom skills in operations integration. They’ve quietly assembled the pieces they need to present a logical subnetwork to the IP service layer, and to operationalize that layer fully.

The problem Ciena has is typical of vendors these days. They have an axle and want someone else to conceptualize the car. The problem with the subduction of IP features is that it’s necessarily an ecosystemic shift. You add stuff to layers that never had it, and remove that same stuff from where it traditionally lived. Both change equally, and so you have to be prepared to present a vision of how the new balance of features would work overall. Ciena is not doing that.

Google, in Andromeda and related stuff, has demonstrated how you use SDN open connectivity as the core of an IP network. A lot of the stuff used in that transformation is open source and available to Ciena and everyone else. It’s not rocket science. Route determination and advertising in IP is managed by control protocols. If you spoof them at the boundary between “real” IP and an IP subset based on agile physical-layer technology, you can make the IP network see what it expects, however much of it is displaced. Google creates this boundary at the BGP level, but you could do it everywhere from there out to the default gateway.

Control protocols in IP are a big part of the whole picture of feature subduction. Most of the IP features not directly related to passing packets are supported via one or more control protocols. These combine to create management, topology discovery, and endpoint visibility capabilities in IP. If you decide to subduct an IP feature, you need to then intervene in the control protocol handling associated with that feature. If you can do that correctly, then where and how the feature is provided isn’t relevant and your subduction is feasible.
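
As a purely conceptual illustration of that intervention, consider a boundary element that answers IP-side control queries on behalf of a subducted lower layer. No real routing stack is used here; the tables and message formats are invented for the example.

```python
# A conceptual sketch of control-protocol mediation at the boundary between
# "real" IP and an agile lower layer. The reachability table and message
# shapes are hypothetical stand-ins for what a BGP- or IGP-speaking boundary
# element would actually exchange.

# Reachability the lower (optical/electrical grooming) layer already provides.
lower_layer_reachability = {
    "10.1.0.0/16": "groomed-path-A",
    "10.2.0.0/16": "groomed-path-B",
}

def handle_control_message(msg):
    """Answer IP-side control queries so the IP network 'sees what it expects',
    even though the forwarding behind the boundary has been displaced."""
    if msg["type"] == "route-query":
        prefix = msg["prefix"]
        if prefix in lower_layer_reachability:
            # Spoof a normal advertisement; the IP side never learns that the
            # path is actually a groomed lower-layer tunnel.
            return {"type": "route-advertisement", "prefix": prefix,
                    "next-hop": "boundary-gw"}
        return {"type": "route-withdrawal", "prefix": prefix}
    return {"type": "ignored"}

print(handle_control_message({"type": "route-query", "prefix": "10.1.0.0/16"}))
```

Get the control-plane answers right at the boundary and the feature behind them can live wherever you like.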

This same level of attention is important in the creation of the IP service overlay network, and the hottest topic in the WAN today is an example of where we should see it. SD-WAN creates an overlay network that merges the IP VPN created through traditional MPLS VPN services with an overlay VPN created by SD-WAN nodes. Ideally, SD-WAN products should “look” like a gateway router, even perhaps a default gateway, and ideally they should participate in control exchanges that make SD-WAN-connected users and resources true partners in the VPN, just as if they were directly on it. Perhaps some or most do, but you don’t hear anything about that in the documentation, as users and prospective users have pointed out to me.

A holistic vision for an SD-WAN service layer would likely create what might be called “IP lite”, a stripped-down version of IP that eliminated the features that were to be subducted down to the transport level. It would also have to define how the control protocols for the features involved would be spoofed/emulated or transformed at the service layer boundary, the user interface, or both. This is what some people think the MEF is up to with its SD-WAN work, but insiders tell me that they’re not focusing on that at all, but on standardizing the interface between an SD-WAN node and the IP network providing transport service. A similar transformation might then be provided for the Ethernet (“Third Network”) transport model.

Does Ciena have to worry about the electrical-layer players? After all, you can create a grooming layer from above as easily as from below, right? Perhaps not. Optical-layer technology is mandatory in any realistic scenario for metro transformation, and there’s not much white-box optics rearing its head. At the electrical layer, everything above a hundred gig is very likely to move to white-box, and perhaps to SDN, and so the major switch vendors are less likely to have an incentive to drive the market.

That’s Ciena’s real worry, because they’ve been anemic at driving the market themselves. Like many vendors who offer something essential, Ciena has been focused more on fulfillment than on sales/marketing, and as I noted earlier, they’ve followed the normal vendor pattern of kissing off the future in the name of performing in the next quarter. If they want to leverage the enormous benefit their optical position gives them, they’ll have to do better.

How Operators Might Influence the SD-WAN Space

SD-WAN has been evolving in a number of ways, but the most important of the lot is the way an enterprise buys it. There are three options now available—directly from someone (vendor or VAR) as a product to be installed by the buyer, from a managed service provider (MSP), and from a network operator. Of the three, the network operator channel is by far the fastest-growing and the most credible with prospective users, and that’s likely the primary driver behind SD-WAN evolution overall. What might that mean for where SD-WAN ends up?

Operators I’ve talked with are frankly confused/conflicted about SD-WAN. On the one hand, they hate the idea that something could displace MPLS business, which is profitable for them. On the other hand, many operators admit that MPLS isn’t all that profitable except in major sites (like secondary data centers or regional HQs) because of the pricing pressure on it. They also admit that the question may not be so much whether SD-WAN kicks MPLS out of some sites as whether it’s their own SD-WAN or someone else’s that does the kicking. Self-cannibalization is better than being eaten by a third party.

Perhaps the most attractive truth about SD-WAN is that it is much more likely to be sold as a managed service than an MPLS VPN. Operators have long hoped to improve managed service sales on the grounds that they could (with proper tools) create better management economies of scale than users, and that this could then let them make a nice profit on services while still offering users a net reduction in total cost of ownership. Some operators tell me their surveys suggest the MSP gains could be much more than any possible MPLS displacement losses. Others disagree.

The conflict within each operator is mirrored by confusion on how to sell the service. Do you put the SD-WAN arrow in your quiver and keep it ready in case a competitor starts casting covetous eyes at your MPLS sites? Do you step up to promote SD-WAN where the buyer doesn’t have MPLS now, risking their extending it to marginal MPLS sites too? Most important, is SD-WAN just a strategy for VPNs via the Internet, or is it something bigger, broader, and more compelling?

Many operators are actually in multiple camps on these issues. Some see SD-WAN as purely defensive in one regard, and yet also see that MSP opportunities are greater if the SD-WAN solution has some real useful features that could add to the buyers’ business case. About a dozen operators have, or are planning to have, multiple SD-WAN strategies in the near term, keeping them all somewhat covert to prevent confusion and planning to unwrap the one that turns out best when the buyers’ needs are clear.

In the short term, operators are most comfortable focusing on SD-WAN as a managed service that’s designed to extend existing MPLS VPNs to sites where they’re not available or priced acceptably. This extension mission doesn’t involve a lot of complicated features, but it also directly collides with MSP offerings and missions. Some operators, even today, don’t think they can sustain such a limited SD-WAN service objective.

In the longer term, operators’ strategic planners think that it’s inevitable that SD-WAN will become a feature race. They point out that the basis for “vanilla” cost savings from MPLS displacement and management economies of scale is almost identical regardless of the specific SD-WAN solution. Nobody wants to be in a pure price competition, particularly players like the network operators who are already engaged in commodity service marketing. Adding feature value improves differentiation and also increases buyer benefits, which can then justify higher service prices and profits.

That, of course, raises the question of what those higher-value features should be. So far, most SD-WAN vendors offer little beyond the essential overlay network capability that SD-WAN depends on. A few have traffic management capabilities, can identify and optimize routes, and provide other connection-related features. Some now offer cloud-resident clients, and a very few (two, perhaps) have logical networking and network-as-a-service features that actually advance connection management and security considerably. However, my research shows that buyers are curiously unfamiliar with most of the SD-WAN features or the differences among vendors.

One reason for that lack of familiarity is that SD-WAN’s business case is easily made by MPLS VPN extension or displacement, and everyone in sales knows that the goal is to get the check without raising spurious issues. Operators in particular are not accustomed to feature differentiation, or even to selling and marketing in the traditional sense. Rather than make the market for feature-based SD-WAN, they’d prefer that vendors make it. Most, at this point at least, are a bit annoyed that the vendors have not been doing much of a job of conditioning the market for a feature-based SD-WAN future. One operator told me that three-quarters of the SD-WAN vendors don’t have any features (beyond the vanilla) to promote, and the rest don’t know anything about promotion.

Those same operators will inevitably be forced to extend SD-WAN beyond the basics, not least because it offsets the risk to their MPLS VPN business. That may spell the defining truth of the evolution of SD-WAN. Most of the current incumbent SD-WAN vendors, ironically all of the current market leaders, are feature-deficient in the extreme, and they’ll have to fix that quickly if the operators drive more attention to features. But all the vendors are going to have to decide how to promote SD-WAN at the feature level, and that’s harder than it seems.

The media hates nothing more than an attempt to re-launch something. It’s lousy for attracting readers and ads, after all. It also calls into focus the fact that the initial story on the technology was off-base, which makes it harder to push the next wave of hype. It takes some very clever PR work to get a feature story on SD-WAN out there, and good website collateral, follow-up material, and even training and sales initiatives at trade shows and events. In short it takes a program, and nobody really has that today.

There’s also the problem of what could be called “feature collision”. Can a vendor like Cisco, for example, push all the wonderful features of network-as-a-service, including many features designed for cloud symbiosis and security, when their sales of other technologies in those very spaces are what buoyed up their most recent quarter?

Why has security become such an issue today, when in the old days of IBM SNA and Dataphone Digital Services or other leased-line services it was a minimal problem? Answer: the old model was intrinsically secure. If SD-WAN made VPNs intrinsically secure, would that obviate the need for (and sales of) a whole bunch of security layers and tools? Probably it would at the least reduce that need/sales combination.

Feature collision doesn’t pose a threat for vendors who don’t have security features besides those provided in their SD-WAN product. That’s what makes Oracle’s deal to acquire Talari interesting. Talari doesn’t have a particularly feature-insightful offering, but perhaps Oracle would be less resistant to pushing the feature envelope than a network vendor like Cisco, with a security business to protect. VMware, who purchased Velocloud, might also be expected to enhance features to gain additional market traction.

A feature opportunity that seems immune from collision issues harks back to my comment on economies of scale. If you’re going to do SD-WAN as a service, you need to make best use of any resources you provide, but most elements of SD-WAN will be hosted with the customer. The big issue will be operations economies of scale, which means a lot of zero-touch automation relating to the deployment of SD-WAN and maintaining its service levels in conformance to the contract. SD-WAN vendors offer management tools, of course, but nobody yet is really focused on zero-touch automation.

Network operators, obviously, could expect to sweep SD-WAN under the umbrella of their general plans for service lifecycle or zero-touch automation…if they had them. Right now, operators lack any real progress toward a general solution (the latest version of ONAP is about to come along, but I don’t yet have the details on what they might have included). That raises the question of whether an SD-WAN vendor might add an automation strategy of their own, or even whether an MSP might come out with an effective approach. There are enough pathways to success that somebody is sure to take one of them.

All this is going to take time, though. The most recent deals for market entry into SD-WAN services (MSPs and operators) have tended to focus on the relatively feature-disadvantaged market leaders. This shows, in my view, that most prospective providers of SD-WAN-based services are interested only in the basic value propositions—lower connectivity charges using business Internet rather than MPLS, or productivity gains by extending VPNs to sites not suitable for MPLS connection. If network operators are unusually defensive about SD-WAN impact, and if they’re the fastest-growing channel for enterprises to get SD-WAN, then it’s going to take a lot of kumbaya moments among the operators’ executives to socialize a feature shift. But it only takes one major player to see the light and the rest will follow, or lose.

A Deep Dive into “Cloud Transformation”

Last week I blogged about three critical trends in networking and the cloud. My goal was to position these technologies in the framework of network operator evolution and transformation. I’d blogged a bit about each at the technical level prior to last week, but I want to go into the specific technologies and their detailed picture now. Some of my readers have suggested this could be helpful in addressing each (or some) of the points.

My first point was the transformation of the cloud, meaning the evolution in the way we see cloud computing being adopted and the resulting impact on cloud technology. The cloud, like everything else in tech, has been mercilessly overhyped since the notion came along. As a result, we’ve lost a clear picture of what it would be good for, and what technical features might be needed to optimize its utility. I think we’re seeing that now.

Early cloud conceptualization was based on “server consolidation” and “migration”. Enterprises were expected to move things to the cloud, either because those things were currently hosted on inefficient siloed servers or because overall cloud economies would be better than those in the data center. From the first, my surveys and modeling said this wasn’t going to have the impact being proposed; business case limitations meant only 24% of current hosting requirements could be shifted to the cloud.

In the real world, of course, you have a nice laboratory for the refinement of the applications of a technology. In the case of the cloud, what we saw develop was a combination of specialized applications, largely developed by startups in the social media space, and “front-end” layers to traditional business applications designed to optimize the dynamic nature of web and app-based GUIs and mobile workers and users.

Some of our new cloud-specific applications, like Twitter, really live in a kind of perpetual front-end space. Because it serves a broad and conversational community, there really aren’t “transactions” growing out of the application. Instead you have the need to create an elastic pool of resources that can be jiggled to address peaks and valleys of tweet activity. The unit processing of a tweet is minimal, but the collective demand for resources could be very large.

On the enterprise side, people started to recognize that users of applications (whether workers, customers, or partners/suppliers) tended to spend a lot more time thinking about what they wanted to do than was spent processing the resulting transaction. The former step, the think-time, often required pretty interfaces and fairly stock catalog-type information, which the user was expected to browse and consider. Then, at the end, the user might initiate a single transaction that took milliseconds to process. Think about an online retail operation, or CRM, and you’ll get the picture.

What all of this did for the cloud was shift our conception of what lives there. If you visualize the cloud as a bunch of static application-to-server relationships, then the benefit of the cloud is the capital and management economies of scale a cloud provider might enjoy over the enterprise data center. Because server hosting economies of scale follow an Erlang curve, which plateaus as resources increase, it turns out that big enterprise data centers secure an economy good enough that the cloud difference won’t cover the profit margin the cloud provider needs.
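
For those who want to see the plateau, here’s a small illustration using the standard Erlang B formula. The blocking target and server counts are arbitrary assumptions; it’s the shape of the curve that matters, not the exact numbers.

```python
# A small illustration of why scale economies plateau, using the standard
# Erlang B formula. The 1% blocking target and the server counts below are
# illustrative assumptions.

def erlang_b(offered_load, servers):
    """Blocking probability for a given offered load (Erlangs) and server count."""
    b = 1.0
    for m in range(1, servers + 1):
        b = offered_load * b / (m + offered_load * b)
    return b

def max_utilization(servers, blocking_target=0.01):
    """Highest per-server utilization achievable while staying under the target."""
    load, step = 0.0, 0.1
    while erlang_b(load + step, servers) <= blocking_target:
        load += step
    return load * (1 - erlang_b(load, servers)) / servers

for n in (10, 50, 200, 1000):
    print(f"{n:5d} servers: ~{max_utilization(n):.0%} utilization")
# Utilization rises quickly at first, then flattens; a big enterprise data
# center is already far up the curve, so the cloud provider's edge is small.
```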

Serverless, functions, microservices, and container hosting in the cloud are all responses to this reality, to a new vision of what the cloud has to be, which is a highly dynamic and scalable source of transient processing. This doesn’t diminish cloud opportunity; my original models of cloud opportunity said that this kind of application, which couldn’t be supported effectively in the data center, could double IT spending by addressing new productivity and revenue opportunities. That doubling would largely benefit the public cloud.

The thing that’s made this transformation of the cloud’s mission difficult is that the new cloud-friendly (or cloud-specific) application model is totally different from what we’re used to. So different that development and operations people have little or no experience with it, and thus cannot feel confident in their ability to adopt it. Few today recall (as I must confess I do) the early days of IT, when nobody really understood computers and what they could do, and businesses often let IT organizations decide on their own how a particular application would be “automated”. We’re back in those days now with the cloud.

Who solved the problem of “automation” in the ‘60s and ‘70s? It was IBM. I worked, in that period, in constant contact with dedicated IBM site representatives, both sales and technical, and with specialists in things like computer-to-computer communications, a concept essential to distributed processing (and now the cloud) but then in its infancy. The IBM of that period stepped up and made computers an effective, often compelling, business tool. That’s what the IBM of the present needs to do with the evolved mission of the cloud.

We know, from leading-edge work being done largely in the open-source community, that cloud applications will consist of a three-layer structure, with server resources at the bottom, applications at the top, and an orchestration and scheduling layer in the middle. This structure is evolving today in the event/serverless space in the cloud, and also with containers and Kubernetes or Apache Mesos, DC/OS, and Marathon. That evolution isn’t complete, though, and it’s proceeding in different ways in each of the examples of early implementation that I’ve cited.

Orchestration really needs to be about two basic things—the automation of the application lifecycle, and the distribution of work. Classic container tools (Kubernetes and others) have tended to focus on deployment first, add in some capability to respond to issues with infrastructure or workloads, and then add in “load balancing”. Serverless computing has the ability to do both lifecycle automation and work distribution, and it can also be used to sequence steps that are implemented statelessly and thus have problems self-sequencing. Most applications aren’t really candidates for serverless, though, because of the combination of cost and latency. In particular, the sequencing or state management elements of orchestration for serverless cloud should be expanded to the cloud at large, because it would solve the problem of managing state when components are replaced or scaled.
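
A tiny sketch shows why the orchestration layer has to own sequencing and state when the components themselves are stateless. The step names and the state store here are hypothetical; the shape of the problem is the point.

```python
# The step functions below are stateless (any replica could run them), so the
# sequence and the per-request state have to live outside the components, in
# the orchestration layer's store. All names are hypothetical.

state_store = {}   # stands in for an external store (database, cache, etc.)

def validate_order(ctx):  ctx["validated"] = True;  return "price_order"
def price_order(ctx):     ctx["price"] = 42.0;      return "confirm_order"
def confirm_order(ctx):   ctx["confirmed"] = True;  return None

steps = {"validate_order": validate_order,
         "price_order": price_order,
         "confirm_order": confirm_order}

def handle(request_id):
    """Advance one step; because state is external, a scaled or replaced
    replica can pick up exactly where the last one left off."""
    ctx = state_store.setdefault(request_id, {"next": "validate_order"})
    step = steps.get(ctx["next"])
    if step:
        ctx["next"] = step(ctx)

for _ in range(3):
    handle("req-1001")
print(state_store["req-1001"])
```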

From an implementation perspective, the big question about this new orchestration layer is the extent to which it is complete and also available as middleware. There are a number of implementations of orchestration that provide the models and decomposition, but don’t provide the mission-specific logic needed. Those can be added externally. For example, load balancing and work distribution are already available as adjuncts to Kubernetes. It’s unlikely that a total, general, solution to our orchestration layer can be provided, so what would be essential is an architecture that defines how those specialized pieces not provided could be integrated. That could ensure that we didn’t end up with silos for each mission.

These requirements sound a lot like what’s needed for orchestration in SDN, NFV, and zero-touch automation. They should be exactly alike, because the biggest mistake that the operator community and its vendors have made in those areas is to ignore the work done for the cloud at large. If IBM, Red Hat, VMware, or anyone else wants to gain the love and respect of the operator space, they should be focusing on a converged set of tools and not a bunch of special initiatives that, by their very specialization, will never gain enough attention in the open-source community to build and sustain utility.

We obviously need a more agile (and more singular) approach to all of this. Business applications need this orchestration to organize front-end work and pass it efficiently to the legacy transactional applications they depend on. IoT needs it for event management and state control. The fact is that everything in the cloud needs it, and so it should be the foundation of a cloud architecture.

Few really know much about this orchestration layer, despite the almost-dazzling breadth of application. “Orchestration” is for today and the future what “automation” was in IBM’s heyday. Red Hat has the potential to bring the knowledge of all of this to IBM, and IBM could use that knowledge to recapture the role it played in “automation” fifty years ago.

Is Blockchain Really Needed for Service Lifecycle Automation?

We continue to hear a lot about blockchain technology in networking, including a role for it in service lifecycle automation. Some have suggested it’s a critical technology in the latter, and while I think it could play a very useful role, I think that the value of blockchain there (as is the case in many areas) is secondary to an architecture that directly addresses the problem.

Blockchain is a distributed mechanism for establishing the trustworthiness of something. Most people know it for the way it’s used in cryptocurrency, which is authenticating the ownership of a “cryptocoin” as a step toward accepting it from someone else. But as recent market shifts in the cryptocurrency space have shown, blockchain doesn’t make the currency worth anything. The application—in this case, the use of a cryptocoin as tender—is more than the blockchain.

Probably the best way to understand what blockchain could do in networking is to look at a broader topic where trust is an issue—electronic data interchange or EDI. The purpose of EDI was to allow electronic transactions to substitute for commercial paper, things like invoices, payment advices, orders, and so forth. I was involved for quite a while with the Data Interchange Standards Association (DISA), who formulated the EDI data models. One classic example of EDI trust was that someone thought they were ordering 10 watermelons when the seller interpreted the order as 10 truckloads of watermelons.

The early solution to trust in EDI was to use an EDI network that acted as a transaction intermediary, receiving our watermelon order from the buyer and forwarding it to the seller. The network maintained a copy of the transaction and so would be able to establish whether the order was in fact for 10 watermelons or 10 truckloads. This process gave transactions the property of non-repudiation.

Blockchain could do the same thing without the need for that intermediary EDI network and its associated costs and security/compliance risks.  A blockchained order would be immutable (as long as you don’t believe the encryption could be broken) and in fact could carry with it all the associated back-and-forth banter as well as other steps like payment, shipment, and so forth.  Every step in the order/fulfillment process would be added to the chain, and the record would be authoritative.
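
A toy hash chain shows that non-repudiation property at work. This is a teaching sketch, not a real distributed ledger; there’s no consensus, signing, or distribution here, and the order records are invented.

```python
# Each step of the order/fulfillment exchange is appended with a hash that
# covers the previous entry, so any later tampering breaks the chain.

import hashlib, json

def append(chain, record):
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    entry = {"record": record, "prev": prev_hash}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    chain.append(entry)

def verify(chain):
    for i, entry in enumerate(chain):
        body = {"record": entry["record"], "prev": entry["prev"]}
        if entry["hash"] != hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False
        if i and entry["prev"] != chain[i - 1]["hash"]:
            return False
    return True

order_chain = []
append(order_chain, {"step": "order", "item": "watermelons", "qty": 10})
append(order_chain, {"step": "acknowledgment", "qty": 10})
append(order_chain, {"step": "shipment", "qty": 10})

print(verify(order_chain))                       # True
order_chain[0]["record"]["qty"] = "10 truckloads"
print(verify(order_chain))                       # False: tampering is detectable
```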

Network services are commercial activities, and so the normal EDI-like blockchain mission could apply to them as easily as to the EDI needs of other verticals.  A service order could be combined with all the steps associated with processing and fulfillment and put inside a blockchain to make it authoritative, meaning trusted and enforceable under contract. Some have suggested that this means blockchain is a necessary condition for automation of the service lifecycle because it’s necessary to represent service orders.  Not so.

If you go back to the days of EDI, you find something called the “bilateral agreement”, which is a contract between trading partners that describes how they exchange EDI data and establish trust. As long as the parties have a means of determining that they have a true copy of something, they can set legal policies to make that something authoritative and non-repudiable. In point of fact, most commercial exchanges today happen electronically and don’t even bother with highly formalized bilateral agreements. Web transactions ask you to review and confirm, and that’s that.

The big issue I see, though, isn’t whether you need blockchained contracts but whether you have anything to process them with.  That problem exists in two dimensions—the service lifecycle automation architectural model and the question of overhead.

An authoritative service order doesn’t process itself.  Service lifecycle automation demands that the order be handled by software, at every stage in its processing and in the face of conditions that impact the utility of the service in such a way that remediation is required.  The biggest problem that we face in service lifecycle automation is that we don’t have a broadly accepted model to do that.

The TMF had an approach (NGOSS Contract) but the resulting specification was never broadly adopted. OASIS TOSCA has been diddled to support the event-handling needed for service lifecycle automation, but the spec itself didn’t include event processing.  I’m told the TMF is rethinking the whole NGOSS Contract thing, and that TOSCA is now adding event-handling to the specs.  These steps could at least give us some formal models to work from, but implementations still have to be developed.

One might think that if, say, a TOSCA-modeled service order were created using blockchain, we could add a layer of value to the overall service lifecycle automation process by formalizing how we create trust. Yes, but only if the result doesn’t blow the quality of handling of events, and it might.

Blockchain is a fairly high-overhead process, both in terms of the resources needed and the time required to make changes or decode the document. That overhead could complicate event processing by delaying the event’s delivery to the appropriate process. That might suggest that while blockchain could be used in the commercial exchange part of service order processing, it might be less appropriate when applied to the “internal model” the operator uses as the basis for service lifecycle automation.

The overhead issue may also be decisive in applications of blockchain to internally generated events/transactions for the purpose of creating an audit trail. Journaling of activity is a strong piece of any security/compliance strategy, provided the journal is actually authoritative. However, the event processing can’t be inhibited by journaling delay, which a blockchain implementation could introduce.

To me, this adds up to a simple point, which is that to the extent to which service provider contracts and other documents are equivalent to EDI’s commercial paper, they’d be potential targets for blockchain implementation just as would be the case for other verticals’ EDI.  Pan-provider services look like a good example of where blockchain could be applied, and so do services created by portals that effectively define a dynamic partnership between enterprises and operators.  There might well be frequent changes to services in those situations. Elsewhere, I’m doubtful.  We all know that blockchain-based EDI is far from pervasive (or even common) so it’s likely that the perceived benefits of such a shift are low in other industries.  I don’t see a major reason to think service providers would find differently.

What is possible, though, is that blockchain will eventually form the basis for “service paper EDI” in the networking industry. Keep in mind, though, that EDI is an interface in its historical applications, meaning that it’s a means of passing commercial paper and not a means of recording the data those papers might contain when the usage is purely internal. That is surely going to be true with networking too, and blockchain won’t be used to encode databases or even internal events and transactions; there’s not enough value to overcome the issues of overhead.

Blockchain is not required for service lifecycle automation, or even particularly related to providing it.  It could be fit into the commercial-exchange part of a service workflow, as it could be fit into other substitutions of electronic transactions for commercial paper.  If we focus on realistic missions, we can avoid the now-epidemic problem of overhyping and misdirecting new technologies.  We can also help service lifecycle automation focus on what’s really important there—the model.

Three Issues Leading to 2019

We are now approaching a new year and a new budgetary cycle for just about everyone.  That makes it a good time to look ahead, and I want to start that now by looking at three examples of important future trends.  I’ll use a specific company/product to highlight each trend.

The first trend is the transformation of the cloud, and the company I’m using as a bellwether is IBM.  Cloud computing is probably the most misunderstood transformation in all of information technology, and one of those who apparently misunderstands is IBM, whose success at navigating past transformations in IT is legendary.  Somehow they’ve missed this one, but they’re not alone.

Everything isn’t going to the cloud, no matter what the hype has said, but the cloud is still going to touch nearly everything. The important truth we’re now starting to see with regard to cloud computing is that cloud services will be used as a front-end adjunct to traditional IT, which will still run largely where it always has—in data centers. The Internet and the mobile/smartphone revolution have utterly transformed information access and customer/partner relationships, but the core applications on which most businesses are based have been remarkably stable for years. Yes, users want more processing capacity, and yes, they want to spend less money, but what they do with these core applications has stayed largely the same.

What this means is that business cloud computing is hybrid cloud, with the cloud part creating an agile multi-device relationship with both the companies’ employees and with the outside world. New cloud stuff is tacked on to old data center stuff, and that’s where IBM has issues. They’ve been the master of the mainframe data center for decades, but mainframes aren’t where front-end or cloud activity is anchored. IBM doesn’t get that, and doesn’t have strong strategic influence where the cloud is happening.

Acquiring Red Hat could solve many of IBM’s technical problems with hybrid cloud, but only if IBM understands that a hybrid cloud hybridizes with a high-density server/software data center, and that what Red Hat does is give them a position in that kind of data center, a position IBM once had and threw away by selling off most of their server assets.

At best, even with Red Hat, IBM is behind the curve.  They’re not the only one, though.  The fact is that most of the major IT firms, software and hardware, have the same blind spot.  What’s saved them from IBM’s continual revenue declines is a broad hardware/software base and the ability to sell both into the data center and into public cloud providers.  That saving grace works only as long as every one of these IT firms makes the same mistakes, though.  If one manages to get the future of the cloud right, and supports it the right way, they bury the others.

The second of my trend/company points is the subduction of IP network features, and the firm that best epitomizes this trend is Ciena.  What’s behind this trend is in a sense related to what’s behind my first trend.  The Internet and smartphones have utterly transformed how people get information and entertainment.  The “value” of phones and cellular or wireline services is really the value of what they connect you with.  Those somethings are naturally migrating under demand pressure, and they’re migrating increasingly toward the access edge.

It’s traditional to think of the Internet as a collection of sites, but in traffic terms it’s really more like a collection of caches.  We’ve cached content, particularly video, for ages, and the amount of caching going on is exploding as we move more to streaming IP in the delivery of video content and the associated ads.  With edge computing, we’ll likely see a radical increase in the amount of process caching we do.  IoT (if we ever get our act together) will depend on process caching.

From a traffic perspective, then, a network isn’t an Internet or a network of sites or users, it’s a network of caches.  There won’t be nearly as many caches as users or sites; a country might have anywhere from a couple dozen to perhaps a maximum of 500.  These caches, representing aggregated demand for something, can easily be connected using something less complex than massive-scale IP routers.  In fact, the best way to connect them is to simply elevate fiber transport features a bit, creating a packet-network overlay to optics that can offer connection grooming when a full optical pipe isn’t justified.  That’s what Ciena has announced and will be expanding.

If operator services are to get smarter, they need to have smarts, which means servers and software and process caches.  Increasingly, then, the metro networks of the world will be linking process caches more than users, and creating what’s really a big distributed cache/edge-computing complex.  “The network is the computer”, as Sun Microsystems used to say.  Ciena is ready for that.

My next trend is the 5G/FTTN hybrid replacing wireline broadband, and the firm that embodies this most is Verizon.  There are multiple reasons why this trend is important, perhaps critical, and the fact that this trend reinforces my other two trends is reason in itself to be following it next year.

One of the biggest challenges in all of networking is pass cost, the cost of being able to make a quick connection to a customer’s site from aggregate facilities you’ve already deployed in the area. Cable companies in the US, who ran CATV cable along nearly every street during the heady time when cable TV was the profit engine for wireline connectivity, lucked out by deploying something that happened to be capable of supplying high-speed digital broadband, including Internet access. For the telcos, whose twisted-pair in-the-ground infrastructure doesn’t have nearly the upside in capacity that CATV does, things were looking ugly. Fiber to the home (FTTH) has way more potential capacity than cable, but it has a pass cost that’s very high because you have to run it down every street where you hope to connect customers.

The 5G/FTTN hybrid works by putting a node somewhere in a neighborhood, then using 5G millimeter-wave radio to connect to the home.  In some areas, operators tell me they can cover a square mile or more with a single node, which could be anywhere from a hundred to well over five hundred homes or small businesses, and more apartments if you have areas where they’re concentrated.  The cost to prep such a neighborhood is the cost of one 5G/FTTN node, which operators say would be less than a fifth the pass cost of FTTH.  In fact, it wouldn’t be much more than the pass cost of cable broadband.
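
As a back-of-the-envelope check, here’s the arithmetic with purely hypothetical numbers chosen to fall inside the ranges described above; none of these figures come from operators.

```python
# Back-of-the-envelope arithmetic only: every number below is a hypothetical
# illustration of the relationship described above (one node covering a
# neighborhood at a fraction of FTTH pass cost), not operator-supplied data.

ftth_pass_cost_per_home = 800.0      # assumed FTTH pass cost per home
homes_covered_by_node = 300          # within the 100-to-500-plus range cited
node_cost = 45_000.0                 # assumed cost of one 5G/FTTN node

hybrid_pass_cost_per_home = node_cost / homes_covered_by_node
print(f"5G/FTTN pass cost per home: ${hybrid_pass_cost_per_home:,.0f}")
print(f"ratio vs FTTH: {hybrid_pass_cost_per_home / ftth_pass_cost_per_home:.0%}")
# With these assumptions the hybrid comes in a bit under a fifth of FTTH's
# pass cost, which is the proportion operators describe.
```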

Streaming IP video is the automatic consequence of the 5G/FTTN hybrid, and so its deployment will forever shift the TV dynamic. It will take years for Verizon’s new home broadband model to eclipse FiOS’s FTTH, but long before the new approach dominates, Verizon will have to decide whether it will try to stay in the business of live TV or partner (as it has with its early 5G/FTTN deployments) with another streaming TV provider.

Streaming TV and ad insertion on the fly will, of course, radically increase the need for edge caching and radically accelerate the network-of-caches model that reduces reliance on routing and increases the value of groomed optical networks.

It’s tempting to see “carrier cloud” lurking behind all of these trends in some way, but that’s muddling cause and effect.  Technology doesn’t drive markets, demand/opportunity does.  What technology can do is facilitate the addressing of opportunities in new ways.  Carrier cloud is a facilitation, primarily of changes in video and advertising.  Those changes will also promote process hosting, and they’re driven by the need to improve broadband speeds and economies.  That doesn’t mean that the broad topic of carrier cloud is irrelevant, only that we’re approaching it through a series of seemingly unrelated moves.  It may not be the most efficient way to do things, but it will get the job done.

The Paths to Service Provider Transformation

Way back in 2013 I attended a big transformation meeting at a Tier One operator.  One of the things that happened that I felt was both interesting and ironic was that the person sitting on my left was making a case for the modernization of OSS/BSS systems as the key to transformation.  The person sitting on my right was making a case for the elimination of legacy OSS/BSS completely.  Does this sound like consensus to you?  Further, that division of perspective is still active today.

How critical are OSS/BSS changes to operator transformation?  At one level, you could say that since operations and the business of service providers are framed by these systems, they’re absolutely critical.  At another level, the right one in my view, you’d say that there are definitely operations things that need to be done, but that OSS/BSS systems may not be where the changes need to be made.

The truth is that service lifecycle automation is the key to transformation.  Things like SDN and NFV require it or any reduction in capex would be swamped by operations cost increases.  Even without any new technology shifts, my models say that service lifecycle automation could save enough to virtually eliminate the risk of declining profit-per-bit.  I’d say that this view is widely held among operators, and that among senior people it’s probably almost an axiom.  It’s less clear just how to go about getting to service lifecycle automation, though.

There have always been three approaches to service lifecycle automation.  One is what could be called the autonomous infrastructure approach, where infrastructure cooperates on its own terms to create services.  IP networks could be made to do this with policy management, and that’s Cisco’s solution.  The second is the operations orchestration model, which says that the OSS/BSS system provides the critical lifecycle automation features via its own orchestration.  The third is the abstraction layer approach, which says that we put a new layer between infrastructure and OSS/BSS, and that new layer does the heavy lifting.

The problem with autonomous infrastructure is that making a “service” a byproduct of network behavior misses the current OSS/BSS processes completely and also tends to create vendor lock-in, since there are no agreed open strategies that operators accept.  The problem with operations orchestration is that it requires a major shift in the way OSS/BSS systems work, because new network technologies (as well as current ones) have to become visible at the OSS/BSS layer.

The abstraction layer has been my solution from the first.  The general goal of such a layer is to harmonize the presentation of services and service elements between what’s above and what’s below.  In the context of service lifecycle automation, the goal is to create abstractions that from above look like the logical, functional, elements of a service, and then to map them to the resources below.  The specifics of that mapping would be opaque, meaning that each abstraction would be an intent model.  This approach has benefits in both the upward and downward directions.
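As a purely illustrative sketch (the class and field names are my own, not any standard’s), the abstraction layer idea can be reduced to a couple of small structures: an intent model that exposes a function and its parameters while keeping the mapping to resources opaque, and a service that is nothing more than a collection of those models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class IntentModel:
    """A functional service element as seen from above: a function name, the
    parameters that express intent, and an opaque 'realize' callable that
    maps the intent onto actual resources.  Names here are illustrative."""
    function: str                               # e.g. "vpn" or "firewall"
    intent: Dict[str, str]                      # what is wanted, not how
    realize: Callable[[Dict[str, str]], None]   # opaque mapping to resources

@dataclass
class Service:
    """What the OSS/BSS sees: just a named collection of functions."""
    name: str
    elements: list = field(default_factory=list)

    def deploy(self):
        # The layer above iterates over functions; it never sees whether
        # "vpn" is MPLS, SD-WAN, or a hosted virtual router underneath.
        for element in self.elements:
            element.realize(element.intent)
```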

Looking up, meaning toward the OSS/BSS, the biggest benefit of the abstraction layer is that it can portray services as function collections, which is really what an OSS/BSS wants to be working with.  Network operations may need to know how a specific function like “firewall” or even “VPN” is implemented, but OSS/BSS systems and their associated operations personnel don’t need to know.  Abstraction, then, lets OSS/BSS systems manage services built through service lifecycle automation, without caring about the implementation of the capability.

Looking down, the biggest benefit is really one of immunization.  OSS/BSS systems are not structured to be event-driven or to provide orchestration capability at the resource level.  Giving them that mission would impose legacy OSS/BSS structures on new service lifecycle software elements, and even on the network and element management tools below them.

There’s immunization down below, too.  Abstraction means that all implementations of a given logical element are equivalent, which means that a software provider, operator, hardware vendor, or whoever could create interchangeable function implementations easily.  You can have a variety of technologies within a functional abstraction, and even have internal orchestration tools there, and they’re invisible and equivalent from above.
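Continuing the same hypothetical sketch, equivalence means any realization that honors the abstraction’s interface can be slotted in without the layers above noticing; the two implementations below are invented examples.

```python
def realize_firewall_as_appliance(intent):
    # Hypothetical mapping: configure a physical firewall appliance.
    print(f"Configuring appliance firewall for {intent['site']}")

def realize_firewall_as_vnf(intent):
    # Hypothetical mapping: deploy a hosted firewall VNF, perhaps with its
    # own internal orchestration; all of that stays invisible from above.
    print(f"Deploying virtual firewall for {intent['site']}")

# From the OSS/BSS perspective the two realizations are interchangeable:
firewall = IntentModel(function="firewall",
                       intent={"site": "branch-12"},
                       realize=realize_firewall_as_vnf)
service = Service(name="branch-access", elements=[firewall])
service.deploy()
```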

If we go back to the old Telecommunications Management Network (TMN) model, which defines management as a layer cake with element management, network management, and service management layers, the abstraction layer sits at the highest level.  OSS/BSS then deals with services, as it usually does today, and it isn’t whipsawed by changes in technology.

To me, the point of all this is that not only are OSS/BSS systems not the major challenge in transformation, but a belief that they are indicates the operator involved isn’t on the right track to transformation, period.  You can’t achieve your goal by attacking the wrong problems.

The regular survey done by the TMF on transformation is, in my view, an indicator of these truths, and the survey itself shows the division in perspective that exists on transformation within operators’ management teams.  My own contact with operators, in a rough survey, suggests that most of the CTO organization thinks that SDN and NFV are what transformation is about, and since there are standards and tests/trials for both, they’re on their way to success.  The CIO organization, where OSS/BSS lives, has almost the opposite view, thinking that not enough attention is being paid to their side of the house, so progress is limited.  CFOs have never believed any of the current hype, nor have the COOs and CEOs, so they are still waiting for a convincing path forward.

Open-source interest among operators is a symptom of this waiting and hoping.  The operators hold a strong view that vendors are not only uninterested in transformation but actively working against it.  Open source is their answer, but as I’ve said several times, open source isn’t an application or an application architecture, and you can have a lousy open-source project that’s of no more value to operators than an obstructionist vendor offering.

We don’t even have the open-source teams lined up right, in fact.  The typical story on the current transformation race pits ONAP against OSM, yet the two are aimed at completely different places in the service lifecycle automation model.  OSM is principally an implementation of NFV MANO, and ONAP purports to be the layer between network infrastructure and OSS/BSS.  The latter, then, is in the right place with the wrong architecture, and the former (IMHO) in the wrong place with the wrong architecture.  “Wrong architecture” is enough wrongness to make the other distinctions unimportant.

There are still reasons to update OSS/BSS architecture, but even there the question is whether to modernize or supplement.  The big questions relate to the way that OSS/BSS systems and personnel interact with service models.  Do we demand that service composition, customer support, and so forth be integrated with current OSS/BSS features, or should we think in terms of “composer” and “viewer” functions that are really independent of current logic and integrated with OSS/BSS via a portal?  The latter view is more closely aligned with what’s being done for business applications as they’re expanded to support more composable GUIs and portals.

The net here is that the abstraction approach may be broadly accepted, but implementing it is another matter.  We still lack a good model-driven way of defining services and linking events to service processes, though the TMF came up with that idea a decade ago.  Without such a model, we can’t define how events actually link to processes, which means we can’t define how service lifecycle automation would be integrated and implemented.  There are companies that have made progress here, but they’ve not managed to inspire confidence among operators.
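To show what such a model-driven linkage might look like in the simplest possible form, here’s a hypothetical sketch; the states, events, and process names are my own inventions, not the TMF’s.

```python
# A hypothetical, data-driven lifecycle model: (state, event) -> (next state,
# operations process).  All names are invented for illustration.
def decompose_order(svc):  print(f"{svc}: decomposing order into intent models")
def remediate_fault(svc):  print(f"{svc}: re-mapping the affected intent model")
def resume_assurance(svc): print(f"{svc}: resuming normal assurance and billing")

LIFECYCLE_MODEL = {
    ("ordered",  "activate"): ("active",   decompose_order),
    ("active",   "fault"):    ("degraded", remediate_fault),
    ("degraded", "restored"): ("active",   resume_assurance),
}

def handle_event(service, state, event):
    """Look up the (state, event) pair and run the linked operations process."""
    next_state, process = LIFECYCLE_MODEL[(state, event)]
    process(service)
    return next_state

state = handle_event("vpn-101", "ordered", "activate")   # returns "active"
```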

I think 2019 might be the year when this changes.  Operators are under a lot of profit pressure, facing a need to define new services in a more agile manner.  Service lifecycle automation is critical to making that work, which is the good news.  The bad is that even if everyone buys into the right approach, it may take time to productize it in a way operators will accept.

Can IBM and Red Hat Work?

IBM is buying Red Hat.  A summary view I’ve already expressed to a reporter is that it’s about the only smart move open to IBM, but there’s a big question whether IBM knows why it’s smart and will exploit the deal properly.  IBM is literally a legend in IT, having held the position of “most strategic influence” in my surveys for longer than any other vendor (in fact, longer than all of them combined).  Red Hat is a legend in open-source, but often “legendary” is a term used to describe past greatness.  Is there really hope this is a good match, and if so, why?

IBM has seen over 20 quarters of steady revenue decline, which surely isn’t a good thing.  Part of the reason is that IBM has virtually exited the market that actually made it successful—selling computers.  Except for the mainframe line, IBM has turned into a software and professional services firm, and that’s obviously going to reduce revenues.  But Microsoft, who owes its existence to IBM’s PC, surpassed IBM in revenues without hardware, so it’s possible to have a growth IT business on software alone—if you have a broad base.

What’s IBM’s biggest problem?  The lack of that base.  When it shed so much of its computer business, it shed realistic access to the great majority of business IT buyers.  You can only do so much with the Fortune 500, and every account you lose has to be replaced from the broader market—a market IBM apparently was willing to give up on.  The obvious benefit of a Red Hat deal is that it brings IBM access to that broad market that they lost.

Red Hat has the base, but it built it by exploiting open-source software that is necessarily available from other sources.  When Red Hat started out, open-source software was a refuge for geeks, and Red Hat effectively took open source out of the Dungeons and Dragons world into the real business world.  The problem is that open source is now mainstream; Red Hat’s success undermined its own differentiation as the “open-source-plus-support” model became mainstream itself.

What’s Red Hat’s biggest problem?  A revenue model that’s increasingly making its offerings look like commercial software rather than open source.  It’s hard to sell something others are giving away, and as open source becomes more comfortable to business, it’s only going to get harder.  If Red Hat wants to make money on free software, they have to add very significant value.

It may be simplistic, but I think that to make the IBM/Red Hat deal gel, you’d need it to address the two companies’ biggest problems.  That raises two questions: is that possible, and will IBM be able to navigate the path to that happy outcome, given its apparent strategic missteps in recent years?  We can start to address these questions by looking at the “popular wisdom” on the motivation for the move.  There are two specific value propositions given for the deal; does either of them deliver?

If you read either the financial or technical press, the whole deal is about the cloud.  By buying Red Hat, IBM immediately becomes a credible competitor to Amazon, Microsoft, and Google, so the stories say.  If that’s really what’s behind the deal, then IBM has a very hard road ahead, and neither company is likely to solve its own biggest problem.

Red Hat is hardly a cloud giant.  Yes, IBM could base a cloud strategy on Red Hat software, but would that be either wise or differentiating?  Red Hat does offer most of the open-source software tools that drive the cloud, but as many skeptics of the deal have pointed out, open-source software is open, meaning that IBM could have gotten the software without buying Red Hat.  The good news is that IBM doesn’t need Red Hat to be a cloud giant; they need Red Hat to be a data center giant with a foot in the hybrid cloud age.

Cloud computing is a natural partner to data center computing when applications have to support a broad and geographically diverse base of users.  The cloud can provide critical front-end support for web-based and app-based users, insulating the data center from the details of the user interface.  For businesses, this is the critical hybrid cloud mission, and it’s a partnership between cloud services and data center technology.
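A minimal sketch of that division of labor, with invented endpoints and field names: a cloud-hosted front end absorbs the user-facing work and passes only the transactional core to an API the data center exposes.

```python
# Hypothetical hybrid-cloud split: a cloud-hosted front end handles web/app
# users and presentation, while the transactional core goes to a data-center
# API over a private link.  URLs and field names are invented for illustration.
import json
import urllib.request

DATA_CENTER_API = "https://dc.example.internal/orders"  # on-prem system of record

def cloud_front_end(user_request: dict) -> dict:
    """Runs in the public cloud: validates and shapes the user request, then
    forwards only the transactional core to the data center."""
    order = {"sku": user_request["sku"], "qty": int(user_request["qty"])}
    req = urllib.request.Request(
        DATA_CENTER_API,
        data=json.dumps(order).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # data center remains the system of record
```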

Every significant business IT user will end up with hybrid cloud.  Hybrid cloud demands a cloud, but as I’ve just noted, it also demands a data center incumbency.  IBM has that for a very limited set of large accounts, and Red Hat has a very broad position in the data center.  If you look at the competitive dynamic between Amazon and Microsoft, it should be obvious that the former has the market share because it won the startup and social-network business, while the latter has corporate and hybrid cloud momentum because it has premises IT products.  Now, with Red Hat, so does IBM, and so a hybrid cloud driver is at least possible.

Another story I hear in the media is that IBM is going to improve Red Hat’s telco position.  Gosh, to me that’s saying that two firms who have nothing much will have plenty when combined.  Red Hat clearly has aspirations for the service provider space, and so does IBM, but neither company has generated swoons of admiration from the network operator community overall, or from telcos in particular.  The operators I’ve talked with don’t like IBM’s apparent disinterest in specific operator issues, and they don’t like Red Hat’s licensing/pricing model.  It’s hard to believe that IBM would somehow alleviate operators’ concerns about Red Hat’s pricing, and it’s hard to believe that IBM’s very limited insight into the operator market would somehow energize Red Hat’s efforts in that space.

There are two possible avenues to success in the operator space: NFV and carrier cloud.  The former, IMHO, isn’t going to deliver decisive results, and in any case I think it would be difficult to reconcile with the Red Hat pricing model.  The latter would require some specific strategy for addressing the real drivers of carrier cloud, and neither Red Hat nor IBM seems to understand even what those drivers are, much less how to address them.

The telco/operator angle on the IBM deal for Red Hat is too tenuous to be believed, no matter how eager you may be.  That means that if classical wisdom is correct, only the first motivator—the cloud—makes any sense.  There’s no question that Red Hat could be IBM’s avenue to creating hybrid cloud success, possibly enough success to take a place alongside Amazon and Microsoft as the premier cloud vendors for real business services.

The problem with this approach is that Microsoft has a long lead in hybrid cloud, and Amazon is trying hard to do deals with other players (recently Cisco) to create an on-prem hosting capability for AWS services.  That’s not enough to create true hybrid cloud opportunity, but it demonstrates that the space is likely to get more crowded and competitive.  That’s not the kind of opportunity that creates a quick validation for M&A.

It’s my personal view that neither the cloud nor the operator market is the primary goal of the deal at all.  I think what this is about is creating a pathway to very broad engagement for IBM, a way for IBM to leverage what remains of its natural talent for gaining strategic influence.  Professional services are an IBM strength; could they be directed through Red Hat at a broader market?  Could they give Red Hat the kind of differentiation that open source once did, but can’t any longer?

That’s where the question of how much IBM really understands comes in.  I think that there could be enormous positive symbiosis here; Red Hat is indeed the only smart move left for IBM.  I think that very few, if any, IBM leaders know why that’s the case, which means it’s unlikely that IBM will be able to leverage Red Hat fully.

I hope I’m wrong.  I literally entered IT on IBM gear, and there’s no company I’ve had more continuous respect for…until recently.  IBM is off-track, and it’s critical for them to admit this, to themselves and to their customers, and use Red Hat to launch a broad effort to align open-source software with the emerging technical trends like containers and cloud computing and IoT, by linking those trends to business.  Do that, and IBM will become a giant again.