Could the Race to Acquire Big Switch be a “Big Switch”?

The story that Big Switch is being acquired by Arista is just a story…for now.  The story that there’s been a lot of M&A dialog around the company is pretty clearly true, and it may well be that it’s the interest overall that makes this important, not who ends up doing the deal.  Big Switch was the premier SDN controller company, so does this mean SDN is back, or what?  Is the interest in Big Switch itself an indication of a “big switch?”

Like a lot of startups, Big Switch rode an early hype wave (SDN, in their case) and found that the wave turned into a ripple before they could get an exit deal.  Like a lot of networking startups, and even public companies, looking for a new lease on life, Big Switch turned to hybrid cloud.  OK, you could say this was jumping from a hype ripple to a new hype wave, but there is some value underneath the move, and that’s probably why there’s broad interest in buying Big Switch.

I mentioned in my last blog that networks are largely divided into “interior” and “exterior” pieces.  Interior networks don’t have to worry about connecting users; they funnel traffic between aggregated sources and data center or cloud hosting points, and, in an increasing number of cases, among those cloud hosting points themselves.  The delineation between the two kinds of networks is getting sharper because of hybrid cloud, and that’s important.

A VPN connects sites, some number of which are data centers and some users.  As the front-end piece of applications migrates to the public cloud (creating the most common example of a “hybrid cloud”), the site association and user individuality of the traffic disappears.  The cloud front-end hands off traffic to the data center back end through a small number of portals.  In addition, the cloud front-end starts to become more interconnected; think microservices and service mesh.  The result is what we could call a “cloud virtual network”, except that in most cases the largest pool of network equipment in it is the switches in the data center.

Can this somehow tie to SDN?  Well, the notion of SDN as a universal way of separating the control and data planes of a network and substituting central route control for adaptive route control hasn’t shaken the world.  One big reason is that scaling this to the Internet was always questionable.  SDN and OpenFlow work fine in a switching mission, though, and software-defined switching has gotten decent traction.  If a hybrid cloud is a kind of huge, cloud-and-data-center-resident, virtual data center, then why not have a virtual data center network that sort of looks like SDN?  Which is what Big Switch had evolved to with its hybrid cloud positioning.

Juniper was/is a prospective buyer for Big Switch according to the rumor mill.  Juniper’s quarter shows that while routing is still the biggest source of revenue, switching is the biggest source of revenue growth.  The VPN, and the consolidation of IP VPN and Internet applications onto the same infrastructure, have reduced router usage.  Could the notion of a cloud virtual network reduce it further, meaning push more and more traffic to “interior” paths where SDN-like technology would be very practical?

We’re not done stirring the brew of the cloud virtual network, though.  Remember that Cisco’s recent announcements seem to have opened the door for a kind of disaggregated vision of a network device, something that involves custom silicon (and P4 drivers to support it), software, and a hardware platform that could be a white box or a proprietary device.  It would seem that Cisco sees that same shift of emphasis to switching, and if the data center is spreading to the cloud, then switching spreads with it.

The question this raises is whether the cloud virtual network of the future is evolving from the cloud side of the hybrid cloud, or from the data center side.  Fierce Telecom’s article says that Arista, Dell, Cisco, Gigamon and Juniper were jousting for Big Switch, which aligns nicely with the rumors I’ve heard.  Arista is a “software-driven cloud network” player.  Dell is a data center equipment giant.  Cisco and Juniper are network vendors repositioning themselves for the switch-dominated future, and Gigamon is a provider of monitoring tools for virtual networks.  You can see how Big Switch fits into any of these spaces.

Today, the center of strategic influence in the enterprise is the data center.  Network policies are set by data center network policies, for the simple reason that the data center is where the bulk of the capital budget goes.  For many enterprises, the only WAN equipment they buy is the edge routers, and these are hardly strategic.  All of the hardware vendors in the game to acquire Big Switch are data center equipment giants of some sort.  Gigamon is an outlier, a virtual-network play that has perhaps a more natural fit with the notion of the hybrid cloud as a holistic platform and not a data center extension.

Either view has to be solidified; a virtual network is an abstraction for real stuff, not real in and of itself.  If the networking industry is shifting focus from VPNs and data center networking to a form of cloud virtual networking, then any player in either the cloud space or the data center switching space is threatened by others who might come up with their own approach, and in doing so own strategic control over buyers.  Big Switch could add enough horsepower to any of these vendors’ solutions to give them that strategic edge.

If the Arista story is true, which we probably won’t know till next month, the Big Switch deal is of particular value.  Data center switching is the piece of the networking market in which it’s most difficult to offer differentiation, as I pointed out when examining Juniper’s quarterly results.  It’s where open-model (white box and open-source software) devices are the most credible competition, as I pointed out when blogging about Cisco’s Silicon One stuff.  Arista is the up-and-comer in the space, and can least afford to be swamped by a wave of Cisco- or Juniper-led cloud virtual networking.

The critical piece in this tale, the one that could confound a lot of plans, is the fact that we don’t really know what cloud virtual networking is.  Is it something like SDN, grown out of the data center and into the cloud?  Is it something like SD-WAN, grown out of the VPN into both cloud and data center?  You could play Big Switch’s assets either way, but any of the vendors who don’t get those assets could mount what we could call a “definitional counterplay”, establishing a model of cloud virtual networking through clever positioning and perhaps some open-source elements.  The result could undermine the value of Big Switch to whoever gets it.

That risk may be why the value of Big Switch in the deal, rumored at only about two-thirds of what VCs have put into it, is lower than some expected.  There’s more to supporting the hybrid cloud than buying “Hybrid Cloud Within!” stickers for your gear and brochures.  Our industry tends to shoot its PR bullets before it builds its supply lines, and if that’s the case here, then the winner of Big Switch could still end up being the loser.

A Deep Network Read of Juniper’s Tea Leaves

Juniper reported their quarter on Monday, and their numbers and commentary always offer an interesting perspective on the networking market.  This, even though you have to dig quite a bit to get past the traditional upbeat tone the company takes, no matter what’s happening.  On the surface, the results were mixed.  Revenue beat by a cent, profits missed, and guidance was a bit below Street estimates; nothing dramatic.  Deeper down, though, there’s some interesting stuff to be had, and not just about Juniper.

To me, the most significant truth you could extract from the call is that switching is eclipsing routing.  Revenues for the year were off by 4%, which is also how much switching products were off.  Routing products were off 12% versus 2018, which the company properly attributes to the decline in the service provider sector, Juniper’s chunk of which dropped by 5% year/year.

The key story to take from the call is that “the Internet” and virtual networking are essentially killing the router market.  People used to build their own networks, buying routers and circuits.  Most of the network investment now takes place within the data center and at a thin boundary between the data center and the VPN or Internet.  This is true of both enterprises and cloud providers; only the service providers actually install deeper network technology like routers.  With that segment locked in a profit-per-bit decline, there’s simply no way growth in routing is going to happen.

Even without the profit-per-bit issue, virtual networks tend to consolidate equipment just as M&A tends to consolidate staffing.  A thousand big enterprises building their own networks from trunks and nodes would consume (according to my model) something like 150,000 routers.  The same number of companies could be supported on VPNs for (again, based on my model), about 11,000 routers (both numbers excluding customer premises edge equipment).

Routing is a much higher-margin business than switching, and routers are stickier products.  Switching is a fairly easy target for open-model networking, meaning white-box stuff, and even without that complication, switching is harder to differentiate.  Juniper is kind of puffing up the prospects in the switching space with 400G, which almost surely won’t be much of a market factor for at least another year, probably two.  Without a lot of organic growth, open-model switching is likely to erode switching sales for Juniper and others in the space by 2022.

A second story from the call is that Juniper, like other vendors, is belatedly realizing that if you could magically reduce opex, you’d have more money available to spend on equipment.  The Juniper/Mist AI story, which Juniper frankly booted when they did the Mist acquisition, seems to be finally coalescing into a broader use-AI-to-reduce-opex story, but it’s still a kind of half-hearted position, and it may be too little, too late to matter.

It is very difficult for any network equipment vendor to now propose an opex-centric modernization approach with a broad scope, because they’ve ignored the issue for years and operators have been doing point projects to eat away at costs, picking all the low apples.  I think AI could have played a major part in zero-touch service-wide automation five years ago, but it would be harder to justify such a project today.

Then there’s the matter of scope.  Operations costs come from devices and users, both of which concentrate (obviously) at the edge.  Interior network equipment management, meaning management of the stuff Juniper sells, doesn’t actually contribute much to overall “process opex”, the direct and indirect costs of network operations overall.  My model says that only about a cent and a half of every revenue dollar is spent on interior-network opex, out of about 5 cents on overall network opex (which includes the edge technology) and 30 cents on process opex overall.  Operators are tending to look for higher-level operations automation, something an interior vendor like Juniper isn’t in a great position to provide.
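
To put those proportions in perspective, here’s a trivial arithmetic sketch using the model figures above:

```python
# Model figures cited above, expressed per revenue dollar.
INTERIOR_NETWORK_OPEX = 0.015   # interior-network opex
TOTAL_NETWORK_OPEX = 0.05       # network opex overall, including edge technology
PROCESS_OPEX = 0.30             # process opex overall

print(f"Interior share of network opex: {INTERIOR_NETWORK_OPEX / TOTAL_NETWORK_OPEX:.0%}")
print(f"Interior share of process opex: {INTERIOR_NETWORK_OPEX / PROCESS_OPEX:.0%}")
# 30% of network opex, but only 5% of process opex overall, which is why an
# interior-only automation story has a small addressable target.
```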

I also wonder whether there’s some vendor bias against an effective opex position, given that even vendors with a full-spectrum product line seem to find the concept slippery.  None of the big mobile players have a good zero-touch operations strategy, even though mobile broadband is an easier target.  That there’s no solution in wireline broadband, which is under the greatest profit-per-bit pressure, is thus not surprising.

And part of the problem could be the next point from the call, which is software.  Software was up 25% year over year, and it accounted for about 12% of Juniper’s revenues.  For network vendors, though, software has traditionally been the proprietary stuff inside their boxes.  In the last decade they’ve dabbled in further software excursions, with mixed success.  Juniper’s own past software M&A hasn’t been a shining financial light by any means (neither, by the way, has Cisco’s).

Juniper seems to mean both “in-box” and “off-box” when it talks software, but most of their software success might be better characterized as “stuck-to-box”.  The Contrail Fabric Management system and the management software for the QFX portfolio are good examples.  These are helpful within the management scope of a switch-dominated data center, but that’s not where most opex is generated, and they’re less useful if there are either other vendors or open-model switching devices involved, because those interrupt the scope of switch management overall.

What Juniper needs, and what all its competitors need, is a better overall understanding of, and position in, the “higher-layer” network software space.  Operators need to do more than cut costs, they need to augment revenue.  Network equipment vendors understandably paint “new revenue” to mean “new connectivity services”, meaning different ways to get people to pay for bits.  Revenue per bit has been in a decline for at least a decade despite the initiatives vendors hoped would arrest the fall, and that’s not going to change.  Bits have to cease to be the only product of a network operator, and vendors have to address that.

My recent blogs on the issues of NFV point out that while everyone seems to admit the need to somehow get on board the “cloud-native” bandwagon, nearly everyone sees it as a marketing move and not a technical transformation.  NFV, which postulates the reduction in cost per bit by reducing capex associated with device-based networks, has created no noticeable impact on capex.  What it should be doing is addressing what “network functions” are needed above the network, to be a part of new non-connectivity services.

Networking isn’t just another application, so it’s not completely realistic to assume that the cloud/Kubernetes community will come up with an approach to architecting network functions for these higher-level services.  It’s also apparently totally unrealistic to assume that network operators will do that, given their total failure to get any useful results from their own initiatives.  That leaves the network equipment vendors, including Juniper.

It also brings out the final point I think the Juniper earnings call exposes.  Cisco, Juniper’s greatest rival, has obviously read the handwriting on the wall in networking and in particular in service provider networking.  They’re accepting the commoditization of hardware and trying to work beyond it.  Juniper doesn’t seem to take that same position; they still bet their earnings growth on traffic trends.  Users suck; suck bits, that is.  Operators are responsible for providing suckable bits, regardless of whether it’s profitable.  Juniper will then sell to them; end of story.

Cisco’s stock is up a bit over 4% for the last year, as of when I write this blog.  The NASDAQ ETF for big tech, QQQ, is up over 35%.  Juniper is down 17%, which shows that Cisco is probably right thinking more broadly about its products and services in the future.  It also shows Juniper needs to rethink its box-and-standards-biased product plans.

Is Conflation of “Containerized” and “Cloud-Native” a Big Problem?

Light Reading did an interesting article asking for “a little less conversation, a little more action” on cloud-native VNFs.  I agree with a lot of what the piece says, particularly about the fact that the market and technology aren’t really ready for cloud-native VNFs, and that the Linux Foundation is eager to try to close the gap.  I’m less sure about why the gap exists, what would actually help close it, and even what the conversation is about.

There seems to be a tendency in the market to conflate “containerized” and “cloud-native”, and that is no more useful than conflating virtual machines and cloud-native.  One of the big problems with our whole discussion of VNF evolution is linked to this conflation, because it leads us to fix the wrong problem.

A containerized application is one that is designed to be deployed as a series of containers, by an orchestration system like Kubernetes.  You could absolutely, positively, now, and forever make virtually any monolithic application, the stuff we run today in single servers, into a containerized application.  You most definitely do not have to be cloud-native to be containerized, and my own input from enterprises is that most container applications today are not cloud-native.

A cloud-native application is an application divided into scalable, resilient, components that are designed to take advantage of the cloud’s inherent agility.  They are typically considered to be “microservice-based”, meaning that the components are small and don’t store data within themselves so they’re at least semi-stateless (purists would argue they rely on externally maintained state).  It’s likely, given the state of the market, that cloud-native applications would deploy via containers, but it’s not a necessary condition.
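
A minimal Python sketch of the distinction (all names invented for illustration): the first class keeps state in process, so it can be containerized but not freely scaled; the second externalizes state, so any replica can serve any request.

```python
# Hypothetical illustration of "containerized" vs. "cloud-native".

# Monolithic style: state lives inside the process.  You can put this in a
# container, but you can't freely scale or replace instances, because each
# instance "owns" the sessions it has seen.
class MonolithicCart:
    def __init__(self):
        self._carts = {}                     # in-process state

    def add_item(self, user, item):
        self._carts.setdefault(user, []).append(item)
        return self._carts[user]

# Cloud-native style: the component is (semi-)stateless; state is kept in an
# external store (a plain dict here, standing in for something like Redis).
# Any replica can handle any request, so instances can be scaled, killed,
# and rescheduled freely.
class CartService:
    def __init__(self, store):
        self._store = store                  # externally maintained state

    def add_item(self, user, item):
        cart = self._store.get(user, [])
        cart.append(item)
        self._store[user] = cart
        return cart

shared_store = {}                            # stand-in for an external state service
replica_a, replica_b = CartService(shared_store), CartService(shared_store)
replica_a.add_item("alice", "book")
print(replica_b.add_item("alice", "pen"))    # ['book', 'pen'] -- either replica works
```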

NFV, at the moment, isn’t either one of these things, so perhaps in one sense this whole conflation thing could be considered moot.  Why not refine two things you’re going to say are bad into one single bad thing?  Answer: Because the fixes for the two things are different, and it’s not even clear how much either fix would matter to NFV anyway.

NFV latched onto virtual machines and OpenStack as the deployment model from the very first.  A physical network function, meaning a device, was to be replaced by a virtual network function (VNF), which was software instantiated in a virtual machine via OpenStack.  Sadly, the model that the NFV ISG came up with was actually more general; the deployment process was to reside in a Virtual Infrastructure Manager, or VIM.  The VIM could have been seen as the abstraction that bound virtual hosting as described by NFV Management and Orchestration (MANO), presumably via some sort of model, to actual resources.

The slip from the path seems to have come about largely because of two factors.  First, a virtual function as an analog of a physical device is a unitary function; second, because it’s part of a network, it demands absolutely the best security.  Hence, VMs.  Nothing in the abstract view of a VIM precludes using VMs, or OpenStack, but apparently the thought was that not precluding it wasn’t enough; you had to mandate it.

And, just maybe, they were right.  If you build a virtual network by assembling virtual boxes that are 1:1 equivalent, functionally, of devices, then you might need the security and resource isolation that VMs can bring.  Containers are not as secure as VMs, though the difference might not be difficult to accommodate.  Containers do put you at risk for more crosstalk between co-hosted “pods” because they share the same operating system and some middleware.  The biggest advantages of containers that proponents of “containerized network functions” cite are that you can get more of them on a server, and that since there’s only one OS there’s also less operations effort.  Those might not be worth much if you’re managing services, not servers, and if your total container throughput limits how many you could stack on a server anyway.

If containers aren’t necessarily the natural goal of VNF hosting, how about cloud-native?  It would indeed be possible to take something like “routing” and turn it into a set of microservices.  That would mean creating separate elements of the overall VNF, then linking them in a service mesh.  If you did that, many of the control, management, and data processes that would have flowed across a router backplane are now flowing through a network connection, with its bandwidth limitations and latency issues.
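
A back-of-the-envelope sketch makes the point; the per-hop figures below are assumptions picked only to show how delay accumulates, not measurements of any real implementation.

```python
# Assumed, illustrative numbers: how per-hop delay accumulates when functions
# that used to share a backplane are chained across a service mesh.
BACKPLANE_HOP_US = 2      # intra-chassis transfer (assumed)
MESH_HOP_US = 500         # container-to-container hop through a mesh proxy (assumed)

def path_delay(hops, per_hop_us):
    return hops * per_hop_us

for hops in (3, 6, 10):   # e.g., classifier -> policy -> forwarding -> ...
    print(f"{hops:2d} hops: backplane ~{path_delay(hops, BACKPLANE_HOP_US):>5} us, "
          f"mesh ~{path_delay(hops, MESH_HOP_US):>6} us")
# Even with generous assumptions, the mesh path is orders of magnitude slower,
# which is why decomposing the data path into microservices is problematic.
```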

Where we can make a valid connection between the two concepts is here; if you could somehow make a cloud-native router, you’d almost surely want to use containers in general, and Kubernetes and its growing ecosystem in particular, as its basis.  If you don’t do that, and thus unite the two concepts into one effort, you probably don’t get far with either one of them.  Which means that the Linux Foundation and similar software-centric efforts may be aiming at the wrong goal.  We don’t want cloud-native or containerized VNFs at all; we want no VNFs.

If a VNF is the virtual form of a PNF, we’ll never make it cloud-native, nor will we containerize many of them in any useful way.  The only truly “successful” NFV application today is virtual CPE, which involves hosting a VNF inside a white-box device.  This mission doesn’t gain anything from containers, but it really doesn’t gain much from NFV overall.  The DANOS and P4 initiatives would be a far better way to address vCPE.

The hope for NFV lies elsewhere, in taking a different perspective from the one taken when the ISG launched.  What has to happen is something that actually came up in the very first NFV ISG meeting I attended, the first major and open one held in the Valley in the spring of 2013.  One attendee proposed that before we could talk about composing virtual functions into services, we had to talk about decomposing services into virtual functions.  Wait, you say!  Didn’t we already say that a VNF was a transmorphed PNF?  Yes, and that’s the problem.

Anyone who’s ever architected a true cloud application knows that the last thing you want to do is make each component a current application element.  You’d inherit the limitations of data center computing because that’s what you were starting with.  It’s possible to build a VNF-based service correctly (Metaswitch had one in 2013, with their implementation of IMS-as-a-cloud “Project Clearwater”), but you have to divide the service into cloud-native functions, not into devices.

We could jiggle NFV’s specifications to make it possible to deploy VNFs in containers, by doing nothing more than abandoning the presumption that a VIM always invoked an OpenStack/VM deployment.  We could create cloud-native VNFs by abandoning the silly notion that a VNF is a software-hosted PNF.  But these are first steps, steps that abandon the wrong approaches.  The next step, which creates the right approach, would also unite the two threads.

Almost everything that NFV has done could, at this point, be done better by presuming Kubernetes orchestration of a combination of container hosts, VMs, and bare metal.  That’s already possible in some implementations.  What NFV needs to do is accept that model as a replacement for everything.  Does that mean starting over?  No, because we don’t need to start at all.  Kubernetes is already there, and so all that’s necessary is to frame the requirements of NFV in the form of a Kubernetes model.  Then we need to construct “services” from “network-microservices”, like we’d build any cloud application.
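
Purely as an illustration of what that framing could look like in practice, here’s a minimal sketch using the official Kubernetes Python client; the function name, image, namespace, and labels are all invented, and the point is only that a network-microservice deploys exactly like any other scalable cloud workload.

```python
# Hypothetical sketch: a "network-microservice" deployed like any cloud workload,
# using the official Kubernetes Python client (pip install kubernetes).
from kubernetes import client, config

config.load_kube_config()                       # or load_incluster_config()
apps = client.AppsV1Api()

container = client.V1Container(
    name="nat-function",                                  # invented function name
    image="registry.example.com/nat-function:1.0",        # invented image
    ports=[client.V1ContainerPort(container_port=8080)],
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="nat-function", namespace="nfv-demo"),
    spec=client.V1DeploymentSpec(
        replicas=3,                                       # scalable, like any cloud app
        selector=client.V1LabelSelector(match_labels={"app": "nat-function"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nat-function"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="nfv-demo", body=deployment)
```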

The article makes a particularly important point in the form of a quote from Heavy Reading Analyst Jennifer Clark. “It became clear working with virtual machines in NFV was not really going to work in a 5G environment, and where they really needed to go was toward containers, and cloud-native Kubernetes to manage those containers.”  That’s true, but for a different reason than some automatic connection between 5G and containers or cloud-native.  Mobile networks tend to have quite distinct control, management, and data planes.  This means that the parts of networking that look like an application—the control and management planes—are separated from the data plane and could easily be made containerized and cloud-native.

Fixing this within the context of NFV discussions like the Common NFVi Telco Task Force (CNTT) is going to be difficult, because standards seem to have a fatal inertia.  Marching to the sea is not a good survival strategy, but lemmings (it is said) have the fatal instinct to do just that.  Marching to the legacy of NFV is just as bad an idea.  Rather than pushing for a standardization of NFVi, which presumes that the higher layers that control it are doing their job, the body should be asking why you need to standardize a layer that’s supposed to be abstracted fully by the layer above.  If the NFV community wants to work on either containerization or cloud-native migration, they should start by fixing the VIM model, because without that, nothing else is going to help.

If VIM is the problem, then NFV is really about how service models are transformed into deployments.  That means it’s about how you model a service, how it’s designed as a series of interactive components, which is what “cloud-native” is about.  It’s then about how those components are deployed and redeployed, which is what Kubernetes is about.  CNTT or NFV shouldn’t be about making NFV compatible with Kubernetes; it should be about making Kubernetes the model of NFV.  Mobile services, 5G, IoT, edge computing, and a lot of other initiatives are eventually going to do that, either by accepting NFV as a Kubernetes application or by letting Kubernetes-based deployments of service features eventually subsume NFV.

Mobile and 5G is where this starts.  Is this a surprise, given Metaswitch’s seminal work with IMS, another mobile service application?  Is it surprising given that SDN separates the data and control planes?  Why is it that so much insight in networking is lost because it happens outside some formalized consideration of issues?  We don’t live in a vacuum, after all.  We live in the cloud, which is the whole point.

Could MMPG Play a Role in Everyone’s Future?

Is gaming a driver for networking?  I’ve been seeing articles that link multiplayer online gaming with edge computing, gigabit Internet service, augmented reality, and a bunch of other things that people are really interested in.  The question is whether the interest would be enough to actually drive incremental networking opportunity, and from that drive infrastructure changes.  Maybe even telco revenue?  We’ll see.

One thing that makes the topic of multiplayer game opportunity complex is the fuzzy definition of the term.  My own view is that there are three models of “online game”.  The first model requires that the user have an account or sign in for ad sponsorship, but the game logic is still local to the user, meaning in the user’s device.  In the second model, the user actually plays a cloud instance via the Internet, and in the third model a community of players inhabit a common game instance, interacting with “stock” elements of the game and also with each other’s avatars.

Estimates of the massively multiplayer game (MMG) market range from around $20 billion to $40 billion, depending on just what models of online gaming are actually included.  For purposes of this piece, I’m looking only at the third model, because only concurrent multiplayer gaming really seems to be a credible driver of improved user-to-game connectivity.  The number of regular players peaked in 2014 at about 15 million (but it’s difficult to get a good number here because users set up multiple accounts), according to my model, and has been declining slowly ever since, to a current level of about 12 million.

All MMGs require that players control a “character” or avatar, and it’s this control path (both to and from the game) that creates any special communications needs.  If players move their avatars, the movement must be communicated from player to game, and the result communicated back to all the players who, at that moment, have the subject avatar in their view.  In addition, any non-player avatars or changes in scene have to be communicated visually to any player whose character could “see” or be impacted by them.

Any latency that accumulates in the path from player to game and back will create a delay in the experience, enough of which is disruptive to the immersive quality of the game.  If some players have lower latency than others, the latency advantage could give them an opportunity to react to something before their online competitors, in much the same sense as low latency gives high-frequency stock traders an advantage.  Similarly, if complex scenes have to be rendered out to each player, the bandwidth of the path would introduce a delay that has the same effect as propagation latency.

How does all of this impact sacred network industry cows like edge computing, augmented reality, or access bandwidth?  First, it’s important to note that gamers who like MMGs are typically competitive, and so it’s arguably more important whether someone perceives a technology as giving them an advantage than whether it’s really measurably true.  Thus, we need to look at both the “real” and “social” benefits, the bragging rights.

If we visualize an MMG as a single process running somewhere, in a data center with massive resources, then the latency seen by the users will be related to the number of trunks/hops transited along the path to and from that center.  Users further from the game would be at a disadvantage. My own experiments with broadband Internet services suggest that hop count has a greater impact on round-trip delay than access speed, as long as the access connection is at least in the range of 25 Mbps.  Thus, faster Internet wouldn’t really make up for unfavorable hop-count.

Suppose the MMG system is distributed?  Geographic processing centers for the game, perhaps as many as one in each metro area, would reduce latency for all users, and as total latency is reduced, the contribution of the access connection becomes more significant.  There is some model indication that in a game centered in a metro area, users running 25 Mbps connections would be at a disadvantage, at least in highly changeable scenarios, versus users with 1 Gbps connections.
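
A quick back-of-the-envelope sketch shows why; the per-hop delay and the size of a scene update below are assumptions chosen only to illustrate the tradeoff, not measurements.

```python
# Assumed, illustrative numbers for the hop-count vs. access-speed tradeoff.
PER_HOP_MS = 5            # queuing/processing per transit hop (assumed)
SCENE_BITS = 500_000      # a scene/state update of roughly 60 KB (assumed)

def one_way_delay_ms(hops, access_mbps):
    transit = hops * PER_HOP_MS
    serialization = SCENE_BITS / (access_mbps * 1_000_000) * 1000
    return transit + serialization

print("Distant central server, 12 hops @ 25 Mbps:", one_way_delay_ms(12, 25), "ms")
print("Distant central server, 12 hops @ 1 Gbps: ", one_way_delay_ms(12, 1000), "ms")
print("Metro-local game point,  2 hops @ 25 Mbps:", one_way_delay_ms(2, 25), "ms")
print("Metro-local game point,  2 hops @ 1 Gbps: ", one_way_delay_ms(2, 1000), "ms")
# With many hops, the access connection is a minority of the delay; once the
# game is metro-local, access speed becomes the dominant (and fixable) piece.
```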

Could we drop the notion of a monolithic, central, game system?  That’s complicated because of the notion of “context”.  A player’s character/avatar “lives” inside a context, which loosely speaking is the area around the character that represents line of vision/influence.  That context might be entirely generated by the game, or it might be shared with other players.  These other players could be just passing through, playing an active role, or even being central to the context.  In the limiting case, there could be a context for each player, if none at the moment could “see” one another.  Even in this case, the game would have to be able to tell whether players had moved so that their context now overlapped.

This framework would have to be fit with edge computing, or any other form of distributed computing.  If two characters were to enter combat with each other (or together against game-generated characters) the game could distribute a component to an edge point, providing the two were in the same general geography.  The more people you involve in the shared context, the more complex the problem of finding a “good” place to host it.

A context-centric view of gaming could mean that each player had a hosted context, and the player then interacted with it directly.  It could be edge-hosted or even hosted on the player’s system or device.  This per-player context would then exchange context-update information with other contexts that, for the moment, overlapped with the player’s own.  Context synchronization could be loose if the players involved were not directly interacting, but would have to be tightened if the players were, for example, engaged in combat with each other.
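
A toy sketch of that context structure (the names, coordinates, and overlap rule are all invented for illustration) might look like this:

```python
# Toy sketch of per-player contexts that synchronize only when they overlap.
from dataclasses import dataclass, field
import math

@dataclass
class Context:
    player: str
    x: float
    y: float
    radius: float = 50.0                 # "line of vision/influence" (assumed)
    peers: set = field(default_factory=set)

    def overlaps(self, other):
        return math.hypot(self.x - other.x, self.y - other.y) < self.radius + other.radius

def resync(contexts):
    """Loose synchronization: only overlapping contexts exchange updates."""
    for c in contexts:
        c.peers = {o.player for o in contexts if o is not c and c.overlaps(o)}

alice = Context("alice", 0, 0)
bob = Context("bob", 60, 0)              # within overlap range -> tight sync needed
carol = Context("carol", 500, 500)       # isolated -> purely local context
resync([alice, bob, carol])
print(alice.peers, bob.peers, carol.peers)   # {'bob'} {'alice'} set()
```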

The player-to-context connection would now be the major source of latency, which means that you could reduce latency by increasing broadband speed.  The more intense the player-to-context interaction, including augmented reality, the more broadband speed would matter.  In short, there would be a growing practical benefit to “better” broadband and edge computing.

If players interact with their own context, it’s easy to make that context local, meaning that it would admit to the possibility of edge hosting.  Connecting context hosts would be required for synchronization, but this would involve edge-to-edge pathways that would certainly not be access connections, and might not even be over the Internet.  You could consider this kind of gaming as a cloud computing application, with the interior pathways private to the cloud provider.

Today’s gaming models don’t really optimize distributable, context-centric, game implementations.  It seems possible that if a game were designed to be distributed, and in particular designed to be distributed to the edge, the experience of gaming could be improved.  It’s also possible that in this situation, edge computing, low latency, and augmented reality could be added to the picture.  In other words, gaming could be a credible consumer of edge computing and gigabit broadband, even faster 5G.

That’s not the same as being a credible driver, though.  Gaming tends to run in fad cycles, with a game growing in player count as it becomes popular, because it’s different.  It loses player count as players move on to other, newer, things.  The question is who would be willing to step up and invest in both a distributable gaming architecture and the edge resources to host it.  Can you see a network operator presenting a gaming business case to a CFO?  “Game them and they will come?”  Probably this is the sort of thing that Amazon or Google (who you recall is already diddling with gaming) would have to do.

If some player does do it, though, it starts an avalanche of follow-ups.  The nature of MMP gaming, as a socially driven activity, guarantees that once some game offers this approach everyone else would have to follow.  So, boldness on anyone’s part might bring about a real near-term benefit…to operators in the form of faster (presumably higher-priced) connections, and to cloud providers in justifying a real-time edge-centric view of the future.

There’s not much difference between a gaming model and a worker point-of-activity model, either.  There’s not much difference between a distributed gaming application and a cloud-native distributed, scalable, application either.  We are moving toward the software framework of real-time service, with or without gaming.  Gaming could, just could, make us move faster.

Connecting the Telecom Food Chain

What happens if the food chain breaks?  Anyone who’s studied biology knows that life is a complex ecosystem arranged in a kind of pyramid, where the stuff at the bottom is eaten by the next layer up, and so forth.  Disruptions in the food chain lead to imbalances in the ecosystem; look at the relationship between lemming populations and fox populations.  Suppose that one of the layers simply disappeared?  Does everything die off, do you create two independent chains at the break, or what?  It’s not a biology question for me, of course, it’s a telecom question, especially for wireline broadband ISPs.

Cisco has a sponsored piece in Fierce Telecom, and the title “Cisco’s Internet for the Future Vision Redefines the Economics” seems to be admitting that the economics need redefining.  To quote a specific point: “This has forced new engineering innovations that will provide the methods to enable the construction of the next phase of the Internet, dramatically improving capital investments and making operations far more efficient than what is currently available.”  While Cisco dances around the point, it sure sounds like they’re saying that the Internet has problems with profit and ROI.  That’s true, and the roots go way back.

Telecommunications was historically a form of a regulated monopoly (in the US for example) or even a part of the government (the “postal, telegraph, and telephone” or PTT bodies in Europe).  In the ‘80s, a hungry capital market encouraged things like long-distance competition, and eventually (in 1984 in the US) there was a break-up of the old model.  The breakup created the first break in the food chain of telecom.  Long-distance services are “interior” services; they connect to users through the local exchanges.  The local side is where the costs are because it’s where the touch is, where individual customers have to be visible and supported.

While this was going on, the transformation of telecom technology to digital form was raising the ugly specter of commoditization.  It took 64 kbps to digitize a voice call.  Modems running over voice services had capacities an eighth of that, perhaps, so obviously you could get more data by using the underlying digital channel.  Businesses started to buy “DDS” 56 kbps services, and even services that used higher layers of the digital trunk hierarchy—T1/E1, T3/E3, and SONET/SDH.

The problem is that there weren’t enough businesses, and there were even fewer that actually needed those higher speeds.  In the US at the time, there were about 7 million businesses registered.  Of these, only about 150 thousand were multi-site businesses, and of those only about half were actually networking their sites.  The consumer was just talking, and telcos worldwide yearned for a model that would encourage consumer “data” connectivity.  You could argue that ISDN and ATM were at least in part designed to open the consumer market to data.  The problem was that point-to-point data was of no interest to consumers, which is where the Internet came in.

The Internet, from a telecom revolution perspective, is the worldwide web, a development of the ‘90s in terms of practical adoption.  Originally, users dialed into vast modem banks to get web access, but by the end of the ‘90s there were broadband (digital) services; in the US, mostly from cable operators whose CATV infrastructure was pretty much data-ready.  DSL followed quickly.

The challenge this all created is that, from the first, the telcos (and cablecos) were not what users wanted; users saw “the Internet” as a vast sea of web servers hosting stuff they were interested in.  Their “ISP” was just a conduit to it, a necessary cost that they’d love to see reduced to zero.  A friend’s teenage child once asked me “Why do I have to pay for AT&T Internet when I use Google?”

This (to be kind in characterization) “unrealistic” view was further promoted by the concept of ad sponsorship, something we already had in television.  If you think about online advertising, there’s a fundamental truth, which is that you can’t sell eyeball space if nobody is looking at you.  A bit pipe is not an eyeball attractor.  Early in the Internet game, the IETF actually took up the problem, and I co-authored an RFC on “Brokered Private Peering” which the leading Internet publication of the time (Boardwatch) thought addressed the problem of settlement among stakeholders in the Internet world.  That problem, they believed, would eventually bite Internet growth.

What was discussed in those early days was payment for retail ISPs when a content resource or “website ISP” peered with them for customer access.  These payments would have made retail broadband more profitable to providers, but of course would have made the OTTs less profitable.  Nobody loves a public utility, everyone loves free, and VCs love new companies rather than making old ones more profitable, even if eventually the profit challenge for the older companies would curtail further Internet growth.  “Hey, I’d have made my hundred million by then!”

It’s hard to say if the no-settlement or “bill and keep” model was good public policy overall.  It probably contributed to early Internet growth, but it may also have contributed to an explosion in failed ventures.  Internet regulatory policy has been all over the place, and still is, and its stance on settlement of this sort is murky.  There were times in the US when settlement was explicitly prohibited, and other times (now) when it’s not really clear what the policy is.  The current scheme seems to be working, in that telcos aren’t going out of business in droves and OTT innovation continues, but there are definitely stress cracks to consider.

In the US, many of the original Bell companies have been selling off areas to new players.  Frontier Communications, one of those who have acquired these lines, is now expected to file for Chapter 11.  Overall, the rural subsidies program (RUS) that’s been boosting broadband in rural areas has had a hard time sustaining the players in the space.  The reason is simple; if you’re an ISP trying to offer broadband in less-populated areas, you have the odds stacked against you because of demand density and access efficiency.

Telco/cableco return on infrastructure is highest where demand density (roughly, GDP per square mile) and access efficiency (right-of-way density in demand areas) are highest.  Where they’re low, it’s harder for the ISP to earn a return on infrastructure.  The industrial country with the lowest demand density and access efficiency is Australia, which you may recall embarked on a kind of public broadband network plan called NBN.  See this Light Reading article for how well that’s gone.  Of course, the statistics for Australia are unusually bad (Australia’s demand density is 20% of the US and access efficiency is 46%), and it’s difficult to say just what a critical level for either would be.  We need to try to work in some real-world metrics.

Broadband speed is a good measure of profitability for an ISP, because it costs more to offer it.  On one chart of top Internet speed by country, the US ranks 15th.  All of the countries that rank higher have relatively contained service geographies, which tends to raise both demand density and access efficiency.  Australia ranks 50th, and Canada (which also has lower demand density and access efficiency than the US) ranks 25th.  Spain’s numbers, in my combined metric, are slightly better than the US numbers, and they rank 13th.  The top 10 on the list all have combined metrics four or more times the US.  What this proves is that natural markets do behave as the combination of demand density and access efficiency predicts.
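
Purely as an illustrative stand-in for my combined metric (which is more involved than this), multiplying the two factors, each normalized to the US, gives a feel for the spread, using the Australia figures cited above:

```python
# Purely illustrative: one simple way to combine the two factors is to multiply
# them, each normalized to the US value (a stand-in, not my actual model).
def combined_metric(demand_density_vs_us, access_efficiency_vs_us):
    return demand_density_vs_us * access_efficiency_vs_us

print("US baseline:", combined_metric(1.00, 1.00))   # 1.0 by definition
print("Australia  :", combined_metric(0.20, 0.46))   # ~0.09, using the figures above
# On this stand-in scale Australia sits at under a tenth of the US, consistent
# with its much weaker speed ranking; the top-10 countries cited would score
# 4.0 or more.
```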

What my numbers don’t show is what to do about this problem.  I think it would have been easy to solve it 40 years ago at the dawn of the Internet age, when you’ll recall it was first raised.  Today, there are many public companies who depend on the current Internet model, and many consumers who depend on the result.  Changes at this point would not be easy.  Telcos have generally failed in launching their own profitable OTT businesses.  Do the ISPs start buying OTTs?  That’s not worked great either (look at Verizon with Yahoo and AOL), but there does seem to be some merit in that approach.

It does seem clear that if my combined metric gets bad enough, the result is a true destabilization of wireline broadband.  Even where it’s not bad at all, there are still indications that operators will look for other investments, foregoing modernization in their own areas or (as I discussed in a previous blog) getting into other fields…like banking.  There probably isn’t a major risk of any big wireline player failing, but there are clear indicators that geographies that don’t have favorable financial metrics are already suffering from under-investment.

I used to be hopeful that this problem could be fixed, but my modeling is increasingly pessimistic about a proactive solution.  The most likely outcome is that we’ll muddle along as we are, and that the market will slowly evolve under current (and future) pressures.  More and more content will be produced and distributed via for-fee sites.  Linear TV will be increasingly displaced by streaming, and without channel lineups many of the less-popular networks will fall away.  We won’t see as much improvement in the Internet and broadband as we’d like, but the fact is that what we have now is (for many) plenty.  I have the low tier of FTTH at home, and I’d run out of people to watch content before I ran out of capacity.

Does Cisco have the answer?  I think that Cisco is reacting to the open-model network revolution the operators are attempting to sponsor, recognizing that a lack of adequate return on infrastructure is going to push vendors into making white boxes if they don’t figure out how to differentiate themselves in an open world.  Their current statements don’t reflect a solution to the problem of ROI, but they reflect a plausible step in an age where network operators hunger for any vision.  We may see whether it’s enough of a step as early as the end of this year.  Will it be in the form of M&A successes, or operator Chapter 11s?  Whatever it is, it will set the tone for 2021.

Did IBM’s Quarter Just Prove their Hybrid Cloud Strategy?

Perhaps the most important rivalry in IT is the one between Red Hat and VMware.  If the future is the hybrid cloud (which it is), and if realization of that future involves the creation of a unified hybrid-cloud platform for development and deployment (which it does), then these are the players who could provide it.  Red Hat was acquired by IBM, which released its first consolidated earnings report last night.  The numbers were good, but to decide who’s ahead in that great strategic rivalry, we need to look deeper.

IBM beat modestly on revenue and EPS, and guided higher than expected for the year.  Red Hat was up 24% year over year, and that was a major contribution to IBM’s position overall.  However, IBM’s own stuff, including its AI and cloud, was also up.  Even this tantalizing summary raises some interesting questions, the second-most significant of which is whether Red Hat’s growth was in any way related to the IBM acquisition.  The most significant, of course, is whether IBM actually bought Red Hat for strategic reasons it’s prepared to develop.

In their earnings call, IBM’s Jim Kavanaugh, Senior Vice President and CFO, said “the next chapter of cloud will be driven by mission-critical workloads managed in a hybrid, multi-cloud environment. This will be based on a foundation of Linux, with containers and Kubernetes.”  This is as good a summary of the real-world IT future as we could hope for.  IBM’s past success was generated by two things, a perception that it had the best strategic grasp in the industry, and its influence in major accounts.  IBM’s decision to essentially exit the hardware space (except mainframes) reduced its influence in both depth and breadth.  Their lack of a clear cloud strategy was obvious evidence against their strategic grasp.  Lose-lose.  Maybe they’re getting that back.

IBM has clearly made strides in exploiting Red Hat, not just milking it for revenue.  Their entire cloud strategy has turned to focus on OpenShift.  They’re introducing Red Hat into professional service relationships, with more than double the number of engagements versus the last quarter.  This statistic bears out my limited data on the source of Red Hat’s upside this quarter; most of IBM’s initiatives have only started to bear fruit, and so the majority of their wins in the new hybrid cloud arena are not going to be seen till later this year.  I think this is why IBM has been cagy regarding future guidance; they offered an “at least” EPS upside in the earnings call.

Another interesting data point from the call is that their cloud revenue was up 23%, and IBM says they’re focusing on the “financial cloud”, targeting a market segment where IBM has largely retained its strategic influence.  This indicates that IBM actually has a plan to maximize near-term symbiosis while they prepare more strategic options.

The likely vehicle for that is the “Cloud Pak”, which IBM announced last August.  The package was always intended to be a form of a cloud PaaS, an ecosystem that could support unified hybrid cloud development.  It was missing any realistic model of hybrid cloud deployment and management until the Red Hat deal, and OpenShift has now filled that in.  It would be premature to say that IBM and Red Hat had created that unified platform for hybrid cloud that I’ve been saying we needed to get, but they obviously intend to do just that.  To quote the call, “Cloud Paks bring together IBM middleware, AI, management and security, and Red Hat’s OpenShift platform.”

They actually could do more than that, though that’s plenty.  In fact, what they “could” do is both an opportunity and a risk for IBM.

The opportunity side is that Cloud Paks now become an ecosystem-as-a-product offering, something that CIOs can look at as the total solution to their hybrid cloud challenge.  The early focus on financial services, which plays as I’ve said to IBM’s strategic influence strength, means that Cloud Paks are likely to get early traction, which means that they could become a de facto approach to hybrid cloud unless somebody else (VMware, obviously, would be a candidate) makes a major and successful counterplay.

Another opportunity for IBM is that Red Hat’s software is often integrated with open-source applications, both horizontal and vertical, through partnerships.  It would be easy for IBM to leverage Red Hat’s position here to create a quick set of Cloud Pak application solutions.  These could then seed Cloud Pak through a broader market, perhaps even broad enough to overcome IBM’s loss of strategic breadth resulting from exiting most of the computer business.

The single most important theme I get from the call is that IBM is now seeing the hybrid cloud as the virtual computer of the future, and they are on the way (via Cloud Paks) to defining its architecture.  Everyone can have one, use one, depend on one.  That’s the subliminal message, and the message that creates the biggest upside for IBM.  If they play this right, then they own hybrid cloud.

Which opens the question of the “risk side” of this.  If IBM boots this, then they discredit themselves, discredit Red Hat, and probably end any chance of being a leader in hybrid cloud, or much of anything else.  This is the ultimate desperate toss of the dice, the all-in bet, and there’s a very good reason why it might not work…NIH.

Cloud Paks are credible hybrid frameworks for one reason, which is OpenShift.  While there are things IBM can bring to the table to enhance them, the technical platform of Cloud Paks is either OpenShift or it’s irrelevant.  Right now, OpenShift is the last in the list of Cloud Pak components; “IBM middleware, AI, management and security, and Red Hat’s OpenShift platform.”  Functionally, it’s the piece that matters, and the question is whether IBM’s organization can accept that.  All the IBM contributions to the Cloud Pak universe have been part of it, and of IBM, for a long time.  They didn’t shake the earth in that period.  What’s new is OpenShift, and that’s what needs to be the focus for IBM now.

Not Invented Here (NIH) has killed more M&A than I can remember, and crippled even more.  Alcatel and Lucent fought each other more than the competition for years and years after the merger, and as a result lost some monumental opportunities.  IBM and Red Hat can’t afford that kind of relationship.

Here’s proof positive of that risk from the call transcript: “As I mentioned, IBM’s Cloud Paks include the OpenShift platform, and so as we sell Cloud Paks, this drives additional Red Hat OpenShift revenue. The transactional nature of Cloud Pak sales accelerated the revenue growth of OpenShift and total Red Hat, reflecting IBM’s seasonally strongest quarter.”  See how Cloud Paks are driving OpenShift revenue?  Baloney.  It’s OpenShift that can make Cloud Paks the platform of the future.

The primary speaker for IBM on the call was their CFO, which isn’t usually the case for the “technical meat” of earnings calls.  We could read this as a positive; CFOs are interested in the bottom line, and because they’re not on a product team, they suffer less from NIH.  We could also read it as a negative; the CFO might be talking because none of the product-line executives were willing or able to.  The implications of that don’t need to be laid out.

I’d like to see this work for IBM and Red Hat, because I’ve worked with IBM and its gear for my entire professional career and because I think they have a shot at giving the industry what it needs, which is a total hybrid cloud platform solution.  They have a shot, too, unless VMware gets itself moving in a total strategic direction.

The Technical Issue of the New Decade

Having commented negatively on research reports in my last blog, I want to try to overcome their specific issues and summarize, based on real data, the likely market trends for 2020.  Wall Street and (to a lesser degree, and in a less timely way) government research provide some numbers I can push through my model to see if anything reasonable emerges.  Here are the results.

We are clearly in a deceleration in both IT and network spending at the broad market level.  You can see this in company reports and in Wall Street financial research that tracks spending and trends based on those reports.  There seem to be two factors involved, one that’s been going on for some time and one that’s fairly recent—late 2018 into 2019 and beyond.

The long-standing issue is one I’ve noted before.  For almost a decade, both enterprises and network operators have been under pressure in return on infrastructure spending.  The operator issue is manifest in the “profit per bit” squeeze, and for the enterprise the problem is a lack of direct connection between productivity enhancement and IT spending.  I’ll focus on the latter of these two here, since I’ve blogged recently on the operator profit-per-bit issue.

An analysis of government data shows that IT spending growth relative to GDP growth follows a cyclical (sine wave) pattern.  When a new IT paradigm that taps into new benefits emerges, spending growth rises until those benefits are realized, then declines as budgeting becomes conservative and “refresh” oriented.  There have been three distinct cycles since the computer age began, and we fell from the peak of the last cycle right around the millennium.  Since then, while we’ve had significant technical advances in both hardware and software, those advances have yet to create that direct-to-benefits connection.

Without an incremental source of productivity benefits, the focus of buyers is to enhance the capacity of their current IT platform at a lower cost.  Obviously, a lower cost means (at the minimum) reducing capital budgets slightly, in terms of dollars per unit of computing power.  What we’re seeing recently is a further push for spending reduction, to the point where about a third of the CIOs I recently talked with say they are actually projecting zero or negative IT spending growth in 2020.

The recent driver of spending change has come as a result of the cloud.  It’s not as simple as saying that “businesses are moving IT to the cloud”, because anyone who works for an enterprise IT organization knows that’s a vast oversimplification.  What’s really happening is the result of a complex dynamic of economy of scale and planning inertia.

Real-time applications, things like order entry, online banking, and the like, tend to have two components.  One is the “back-end” piece that actually updates databases and creates ongoing reports and analysis for management.  This part is typically highly sensitive to the company, its “core business applications”, and is very rarely even thought of as a cloud application.  In fact, my analysis of some user data shows that it would generally be more costly to run these in the cloud because of the way cloud traffic and data access is priced.  The other is the “front-end” that presents the user interface.

There’s a lot of think time in a transaction, from a user perspective, and often a lot of interaction with software simply to set up for what’s being done.  This activity can make up, according to CIO data I got, about a third of the total processing time required, and it scales directly with the pace of user activity.  Transaction processing, in contrast, makes up only about a quarter of back-end activity, the rest being reporting and analytics.

Enterprises figured out quickly that the dynamic of front-end processing fit the cloud model quite well, and as a result, the majority of things “moving to the cloud” have not been applications, but rather the front-end visible part of applications.  You can see this in the shift in emphasis in enterprise programming languages, too.  We’ve gone from a time when C and C++ dominated to a time of JavaScript, Ajax, and Python.

As you move front-end stuff to the cloud, you create an uptick in spending for the cloud data center infrastructure, the so-called “hyperconverged infrastructure” or HCI.  You also begin to build headroom in data center capacity, from the offloading of what constituted a third of overall processing.  This reduces incremental infrastructure need, and this is what’s dominated the general negative trend in enterprise IT spending.  It will continue to be a factor in 2020 and beyond…at least through 2022 according to my model.

This is where economy of scale comes in.  The cloud, as a pool of resources applied across many companies and industries, is more efficient in handling variability of workloads.  Thus, if a hundred units of processing power are shifted to the cloud, they’ll consume on average (according to my model) about 83 units of cloud-power.  If the cloud weren’t, in its steady-state condition, generating a significantly lower unit cost of processing than the data center, it wouldn’t be worthwhile to move there at all.  If it is more efficient, it follows that cloud infrastructure spending will be below the level of data center spending it’s offsetting.
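To make the pooling effect concrete, here’s a minimal toy simulation.  It’s separate from the model cited above, the workload parameters are purely hypothetical, and the resulting ratio depends entirely on how variable you assume workloads are; the only point is to show why capacity provisioned for a combined peak comes in below the sum of per-company peaks.

# Toy illustration of resource pooling; parameters are hypothetical, not the
# model figures quoted in the text.
import numpy as np

rng = np.random.default_rng(7)
companies, hours = 100, 24 * 30          # 100 workloads, one month of hourly samples

# Each workload averages one "unit" of processing but varies hour to hour.
loads = rng.gamma(shape=4.0, scale=0.25, size=(companies, hours))

# Dedicated model: each company provisions for its own peak.
dedicated_capacity = loads.max(axis=1).sum()

# Pooled (cloud) model: capacity is provisioned for the peak of the combined load.
pooled_capacity = loads.sum(axis=0).max()

print(f"dedicated units: {dedicated_capacity:.0f}")
print(f"pooled units:    {pooled_capacity:.0f}")
print(f"pooled share of dedicated: {pooled_capacity / dedicated_capacity:.0%}")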

The complicating factors in all of this are 1) that the early front-end targets are applications not yet modified for mobile/web front-ends, and 2) that data center modernization and refresh still operates on the data center infrastructure.  Almost all the early front-end “migration” to the cloud wasn’t a migration as much as a redevelopment.  In 2020, my model says that new front-end cloud development will come half from new projects and half from actual transmigration.  Thereafter, of course, more and more will come from actually moving something out of the data center.  That doesn’t totally trash data center spending, but it does reduce growth, which is what we saw in 2019 and will see in 2020.

On the networking side, the largest contribution to business spending on network infrastructure is in the data center (switches), followed by branch connectivity devices (CPE).  Since VPN services and SD-WAN services have displaced router-and-trunk self-built networks almost entirely, everything focuses on the VPN edge in infrastructure terms.  CPE is relatively immune from major shifts in requirements, so the largest incremental source of spending is in security, exactly what we’ve been seeing.  However, even that eventually reaches maturity, again likely in 2022.

The overall impact of this is a broad shift from IT and networking dominated by hardware, to a software-centric planning vision.  The major evolution CIOs face isn’t one of “virtual machines” or “containers” in the cloud, it’s one of developing applications to optimize an agile container-centric model of deployment that is both cloud-friendly and also more capital- and operations-efficient in the data center.

Agile applications need to be developed and deployed as agile applications, which is a larger shift than simply deploying in the cloud versus the data center.  While there are plenty of tips and techniques for this, including the microservice-and-mesh model, these are really applicable primarily to front-end components.  This means that efficient front-end cloud development is really a rewrite, which hampers transmigration.  To ease things, it would be helpful to have a development model that was suitable for both cloud and data center, which I think would mean abstracting the notion of a “component” of an application to be either a co-loaded and co-resident component for the data center, or a microservice-and-mesh one for the cloud (and of course, things in between).  We don’t have that model yet.
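As a rough illustration of what that abstraction might look like, here’s a minimal sketch.  The names (PriceQuote, the mesh address) are hypothetical, and this is one possible shape of the idea, not a definitive design: application code binds to a component contract, and deployment decides whether the binding is a co-loaded, co-resident call or a mesh-addressed microservice.

# Sketch of a deployment-neutral "component" contract; names are illustrative.
from abc import ABC, abstractmethod
import json
import urllib.request


class PriceQuote(ABC):
    """The component contract the rest of the application codes against."""

    @abstractmethod
    def quote(self, sku: str, qty: int) -> float: ...


class LocalPriceQuote(PriceQuote):
    """Co-loaded, co-resident binding for latency-sensitive data center use."""

    def quote(self, sku: str, qty: int) -> float:
        return 9.99 * qty  # stand-in for the real pricing logic


class RemotePriceQuote(PriceQuote):
    """Microservice binding for the cloud front end (service mesh, REST, and so on)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def quote(self, sku: str, qty: int) -> float:
        with urllib.request.urlopen(f"{self.base_url}/quote?sku={sku}&qty={qty}") as resp:
            return float(json.load(resp)["price"])


def build_quote_component(deploy_mode: str) -> PriceQuote:
    """Deployment-time choice; the calling application code never changes."""
    if deploy_mode == "coresident":
        return LocalPriceQuote()
    return RemotePriceQuote("http://pricing.mesh.local")  # hypothetical mesh address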

The same thing is true on the deployment side.  A service mesh of microservices is a truly awful way of doing something like crunching through a mass database; the network-connection of components would introduce enormous accumulated delay.  Kubernetes has means of both targeting specific hosts (nodes) with containers (pods) and avoiding them, and similar means should be provided to allow Kubernetes to create an efficient co-residency where latency between components is a critical issue, and also to provide distributability for load balancing and resilience where it’s more valuable than low latency.
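Kubernetes’ pod affinity and anti-affinity fields are at least the raw material for those two scheduling policies.  Here’s a sketch of how co-residency and distribution might be requested, rendered as Python dicts that mirror the pod-spec fields; the labels and the node-level topology choice are purely illustrative.

# Co-residency: schedule this pod on the same node as the components it
# exchanges latency-sensitive traffic with.
colocate_with_pricing = {
    "affinity": {
        "podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {"matchLabels": {"app": "pricing"}},
                "topologyKey": "kubernetes.io/hostname",
            }]
        }
    }
}

# Distribution: keep replicas off nodes that already run one, trading latency
# for resilience and load balancing.
spread_replicas = {
    "affinity": {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {"matchLabels": {"app": "frontend"}},
                "topologyKey": "kubernetes.io/hostname",
            }]
        }
    }
}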

Hardware vendors, particularly server vendors, have been slow to address the data center and application evolution trends, which leaves the field to the software giants, notably Red Hat and VMware.  It’s these two companies that, in my view, will frame the future of IT, because the kind of tools they provide, and the pace at which they provide them, will set the timeline for IT evolution.

It’s also likely that these software giants will determine whether there is indeed another productivity wave, something to drive the fourth of the IT spending cycles that in the past created industry momentum.  Hardware is passive from an application perspective; you just run stuff on it.  Even platform software is partially passive; as long as it frames our new application development and deployment model, it’s done its job.  The thing that’s never passive is application software, and the vendors who define the model of application-building are in the best position to develop or partner to encourage those new-cycle applications.

It’s hard to identify (reliably, at least) a deep driver to all three of the past IT spending cycles, but in the roughly ten years since I first made that presentation to Merrill Lynch financial experts who’d been called in from all over the world to hear it, I’ve come up with a notion.  The simple truth is that the closer IT and its information resources move to the work being done, the better the return.  Better return, ROI, means more investment.

The trends that could move IT closer to work and workers are mobile empowerment and IoT.  We are, as I’ve noted above, in the process of using the cloud to better couple mobile and web devices to the data center.  While this is helpful in many applications (sales and support, notably), it still isn’t addressing the fundamental work-to-IT relationship, which requires point-of-activity empowerment, a new model for collaboration, and the coupling of “field sensor” information to allow an application to “see” the work environment as well as the worker, and make decisions on what would be most helpful.

None of this is rocket science, folks.  The problem is, I think, that we as an industry have gotten focused on easy money.  It’s almost impossible to get venture support for an “infrastructure” startup, and if a startup rode a popular hot-button technology like AI to funding success, it would then be pressured to forget this long-term productivity and point-of-activity crap and focus on something that could be linked to social media.

This is what one of those insightful software giants could fix.  Big, established companies can take a longer view.  IBM was once the master of this, in fact, and Red Hat is one of those insightful software giants (VMware, you’ll recall, is my other candidate).  Even Microsoft might take up the banner here, but until some firm moves the ball on productivity support from IT spending, we’re going to be in a tech market that’s in consolidation mode.  The alternative, growth driven by a new productivity wave, would be more fun, so why not go for it?

Surveys and Research: How Deep is the Ocean, How High is the Sky?

Why is it so difficult to find accurate predictions on, or even current statistics on, technology evolution?  One of the biggest complaints that CIOs presented me with in the last two months is that they don’t get as much useful planning information as they need, and most think they get less of it than they used to.  It’s not that we don’t have plenty of numbers, after all.

The biggest problem is what we might call utility bias.  My own work in this space over decades generates an interesting statistic: among sellers, virtually every research purchase is made to buttress a marketing position.  Among buyers of research material, the primary goal (by a whopping 73% margin) is to validate a position or course of action they’re advocating for their company.  Only 15% look for a truly objective assessment of the space involved (the rest have a mixture of motives with no dominant one).

Years ago, I got an RFP from a big research firm, and the mission (paraphrasing) was “Develop a report validating the one-billion-dollar annual market for widgets.”  An analyst in the firm told me the number came from research they’d done, showing that was the optimum market size for selling reports (I didn’t respond to the RFP, by the way).

Suppose that you want to figure out what’s actually going on.  Are there any steps you can take to weed out the chaff?  There are no guarantees, but there are things you can do, and look for.

The practical step is to place the research in the broader context of the tech market.  For example, if a report says that there will be a billion dollars’ worth of IoT sensors sold, you need to know two things.  First, what exactly does the report consider an “IoT sensor”?  Second, what is current spending on the broader market the target space is part of (industrial control, in this example)?  If the forecast for a subsection of a market is a very big piece of current total-market spending, then you should be suspicious of the massive business case that would be needed to drive such a change.
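A back-of-the-envelope version of that check looks like this; the figures are hypothetical placeholders, not real market data.

# Compare a sub-segment forecast against current spending in its parent market.
def forecast_share_of_parent(segment_forecast: float, parent_market_now: float) -> float:
    """Return the forecast as a fraction of current parent-market spending."""
    return segment_forecast / parent_market_now

# e.g., a $1B "IoT sensor" forecast against a hypothetical $20B industrial-control market
share = forecast_share_of_parent(1_000_000_000, 20_000_000_000)
print(f"forecast is {share:.0%} of current parent-market spending")
# A large share implies an equally large business-case shift the report should
# explain; if it doesn't, be suspicious.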

The next signal is reliance on predictions of or reports of buyer adoption of a technology.  We always see citations like “over 50% of buyers plan to adopt….”  There are even more direct claims; “A third of operators report they’re deploying….”  These kinds of statistics are susceptible to three distorting factors.

Factor one is that the surveys talk to the wrong people.  I spent an enormous amount of time and effort when I started doing buyer surveys, just identifying the right kind of person to talk with and then finding examples of that job type that would actually talk with me.  Sustaining a survey base over time is very difficult, and what tends to happen in the real world is that research firms have a stable of respondents they can count on, and tend to use repeatedly.  I was asked by a company to assess, confidentially, a research report created for them by one of the top analyst firms.  They were doubtful of the conclusions, and when I audited the survey process, it was clear that only about a quarter of the people who were surveyed even had a loose job connection with the technology.  Most worked at companies who could never have consumed it, so the survey base was wrong for the survey.  Be sure you know who’s providing information for research; require a profile of the base.

Factor two is that nobody wants to look like they’re a luddite.  Call somebody up and ask them if they’re using the hottest technology in the IT or networking space, and see how many will say “Yes” even though they’ve never even considered it.  I remember a survey done by Bellcore on ATM (remember that?) which asked whether people were using it.  They were surprised when the trial-audit run of the questions revealed an almost-70% penetration of ATM, which was at the time in its infancy.  I asked to see the transcripts of the questions, and in almost every case, the respondents asked what ATM stood for.  That’s a bad enough sign, of course, but when the acronym was decoded, the most popular comment was “Yes, I use 9600 bps ATM.”  Obviously, they got no further than “asynchronous”, and were talking about modem technology.

A more recent example in the Ethernet space shows this issue is still common.  A publication wanted to see who was using 40G Ethernet in their data centers, and came back with the astounding fact that 40% of their users reported they were.  At the time, no 40G products were even on the market.  Hey, ask me anything, their users seemed to be saying; I’m on the leading edge of tech!

Factor three is that even the savviest enterprises are unable to project future spending trends.  I did a ten-year correlation between what my survey base told me about their plans for a technology deployment, and what was happening three years later.  By the half-way point in the decade, there had ceased to be any statistical correlation between what they said they would do and what they actually did.

Any enterprise CIO will tell you that their budgeting focus is the coming quarter, then the current year; beyond the year-ahead budget planning done early in a given year, they don’t look much further.  The business can do little with those kinds of far-future forecasts, and so they don’t produce them.  Might a CIO or other senior IT type speculate on the future?  Sure, but apparently with little success, based on my assessments.

Network operators have much longer capital cycles, but they’re not much better at forecasting things.  A good metric to prove that point is that for the last 30 years, the percentage of lab trials that turned into actual deployments of new technologies has hovered in the 15% range.  I’ve seen many examples of technology that were put into trials, even field trials, and even limited production deployments, that failed to gain any significant market traction.  Adoption doesn’t mean success, because the scale of adoption determines that.

Which brings me to the final point, which is that even where statistics are correct, they can be highly misleading.  The best example of this is the tendency of research reports to talk about the number of customers who have adopted something, versus how much was consumed.

First, if every prospective buyer purchased a widget, would that constitute 100% adoption?  Bet at least some research would claim that, even though the addressable widget market might be thousands of units per buyer.  The best information on adoption would compare current purchases or usage with a projection of either the expected current market or the total addressable market.

Second, the market tends to jump on a technology concept that got good ink, widening the definition of a term or product scope to far beyond the original.  IoT is a good example of this.  When the concept was introduced, it was about “public sensors on the Internet” available for general use.  I just saw an ad for a home automation product introduced at CES, and this product fit a niche that was already served when the IoT term was introduced.  Clearly that niche wasn’t, at the time, considered “IoT”, yet this product was ballyhooed as an IoT advance.  Bracket creep strikes again!
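Going back to the first of those two points, here’s a quick illustration of buyer penetration versus volume penetration; the figures are hypothetical.

buyers_total = 10_000
buyers_who_bought_one = 10_000       # every prospect bought a single widget
units_addressable_per_buyer = 2_000  # but each buyer could ultimately consume thousands

buyer_penetration = buyers_who_bought_one / buyers_total
volume_penetration = buyers_who_bought_one / (buyers_total * units_addressable_per_buyer)

print(f"buyer penetration:  {buyer_penetration:.0%}")   # reads as "100% adoption"
print(f"volume penetration: {volume_penetration:.2%}")  # the more honest number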

Everyone loves data, data that can be presented as a factual foundation for business decisions.  Reports, surveys, and predictions are likeable data.  They’re just not particularly accurate, which means that if your goal is to get the right answer, you may have to spend more time looking for it than you’d like, if you can get it at all.

CIMI Corporation used to produce syndicated research reports, based on a survey base of 277 users, 77 network operators, and a computer model that was designed to predict the way buyers made purchase decisions.  The numbers were pretty good in accuracy terms, but who wants a forecast of a million dollars in sales Year One when you can get one ten or a hundred times that?  I got out of that business in the ‘80s.

It’s still possible to get reliable data these days, and some of the research that’s published is in alignment with my own model, which I still use along with data people send me to generate my commentary on the market trends.  It’s not easy, though, and there are enough factors that muddy the waters that you should take every number you see with a grain of salt—mine included.  Only predictions that turn out to be true, or surveys that are proven to reflect real conditions, will give you a true picture of conditions, so you have to gravitate to sources that give them to you if…if you want the truth.

Signs of Some Trends that Could Shake Networking

One Street research firm has done a particularly good job wading through the hype in tech to come up with some insights into 2020, and I think they’re worth exploring.  Rather than repeat their themes, let me group them into what I think are major trends.

The first theme is that infrastructure-building will lag service-building.  We tend to look at advances in networking as being the result of changes in (meaning, in most cases, increases in) infrastructure investment.  Look at 5G as an example.  What the research is suggesting is that what we think of as “infrastructure” is less important than the specific service goals, which may result in investment in things outside the usual scope of “network infrastructure”.  A “service” here is the thing that creates direct value, rather than something that’s plumbing or resources needed to build up to that value.

In networking, infrastructure is about traffic.  Network spending has tended to be driven by traffic growth and by the need to improve things like security using overlay technologies like firewalls.  At some point, the infrastructure reaches the point where generalized improvements don’t really justify their costs, and so you have to look at specific applications/services and their needs.

A good example of this is the impact of streaming video and other OTT services.  We tend to think of the Internet as a big end-to-end IP network, but it’s really a series of metro/local delivery areas with extensive caching to support better quality of experience, linked by a wide-area component.  As streaming video increases, the focus of investment lies in the delivery part, which will grow much faster than the core.  Thus, CDNs and perhaps, with the proper service driver, edge computing take precedence in investment over big core routers.

The increased service focus also tends to separate out applications like unified communications and collaboration (UC/UCC) from traditional in-house implementations to a cloud-hosted, as-a-service model.  Specialized applications tend to consume specialized equipment, and often it’s difficult to achieve any reasonable economy of scale in these spaces because of that specialization.  In addition, support for the gear requires different training and perhaps even different teams.  UCaaS is a logical response to this sort of thing; turn these specialized applications for networking into services and outsource them.

All this suggests that we may see more SaaS-style offerings, attempts to build even cloud services higher on the value chain.  We’re certainly facing a CIO crisis of uncertainty with respect to how to get technologies that they accept as useful, even inevitable, into a form that businesses can actually gain value from.  I had a record number of CIOs contact me in December and January on this issue, and it’s the source of the growing interest in the “Kubernetes ecosystem”.

Seen in this light, things like 5G, IoT, and edge computing are at risk in 2020, because they’re essentially technology enhancements to infrastructure in search of a service mission.  Every piece of gear that gets deployed in a network has three negative effects.  First, it consumes capital budget.  Second, it adds to the support burden and costs, and finally it drives a depreciation stake in the ground that constrains choices on equipment because of displacement risk.  The insight the research contributed here is that “build it and they will come” ignores the cost of waiting till they do, which may be a long time, if ever.

It’s not uncommon in our hype-driven technology space to be pushing technology revolutions when the benefits and business case are virtually invisible.  It’s easier to write about edge computing than to write about the hundreds of things that might consume it, and the complicated issues associated with when those possible uses add up to utility.  Selling a box is also easier than selling a mission, which likely will involve a bunch of customer-specific research and analysis that no salesforce wants to get involved in.

We could say that the adoption of the popular revolutionary technologies depends on that first point, the thinking about services rather than infrastructure.  If you consider this point a moment, you can see that cloud success depends on the presumption that the service of hosting can be disconnected from servers.  The services of networking can be similarly disconnected, and that means that consumption of architected, specialized, services is likely to become a priority with users, as the first trend point suggests.  That, in turn, shifts the focus of the network from what it moves to what it benefits.

This shift favors a transition from connectivity services to managed services and then on to application/mission services.  That poses a major threat to both the network operators and their vendor community, because operators have been exceptionally inept in defining higher-level services of any sort.  We would not have a global community of managed service providers (MSPs) if operators globally had thought about the inevitable trend toward offloading support and operations functions.  Specialization in services further complicates their situation by shifting buyer power more toward line operations people, who aren’t infrastructure buyers and don’t want to be.

The final thing I think the research is telling us is that mass use of technology can only happen if tech is dumbed down.  I used to say that large organizations were masters of harnessing mediocrity, because mediocrity was the only thing you could get enough of to build a large organization.  The same thing is true for technology adoption, which means that the complexity of network and IT operations grows with the scale of its adoption, and the labor pool and skill pool available to support it does not.  At some point, you can’t “build it” and let them “come” because you can’t build it in the first place, and can’t keep it running while they’re on their way.

This might be a signal to the entire tech industry, at least the part that sells to businesses rather than consumers.  We’ve spent decades thinking that you advanced your company’s technology by buying gear or software, and now we’re starting to think that individual missions drive mission-specific changes.  That means that there has to be a strong model of technology-building based on fulfillment of missions, or you have to shift to an as-a-service plan.

This is going to have a profound impact on vendors in 2020, if it’s correct.  For the enterprise, applications take precedence over infrastructure, and for the service provider, services take precedence over traffic.  The challenge in both areas is two-fold.  First, you need (as a vendor) to own the driver for spending, which means you have to move your sales process up the value chain, along with what you sell.  Second, you need to be able to harmonize mission-agile top-end offerings with infrastructure at some point.  Services, down deep, become traffic.

I think that “abstraction-think” is the best way to do this.  I also think that the software industry, including at least some cloud providers, is doing a better job at framing the future than the hardware guys are.  The onrush of what I’ve called the “Kubernetes ecosystem” is an example of vendor attempts to build an agile software platform for cloud-ready applications, and by doing so define the properties of those applications.  This work is probably going to be the most important thing in IT in 2020.

For networking, things aren’t so clear.  Network operators have resisted getting involved in “higher-level” services, and their desire to prevent vendor lock-in on infrastructure purchases has limited their ability to think above the box-level infrastructure planning they’re used to.  Feature planning is the foundation of service planning, not network-building.  Operators have an opportunity, in addressing this seismic shift, to gain some traction in the service and value side of the network, but they have a history of fumbling their way to defeat on this sort of issue, leaving things to the agile OTTs.  I think that if this happens now, we can forget the idea of operators ever being more than plumbers, and in many areas where demand density limits network connection profitability, we may see further need for government intervention just to keep the wires humming.

Is an All-Router Network the Best Path to Universally Useful IP Services?

What kind of IP do we need?  Overall, the Internet is created by what we could call “evolutionary IP”, a product of decades of changes and refinements through the well-known “request for comment” or RFC process.  Many of these changes were once considered promising and have now been abandoned.  At the same time, we’re in a situation where most CIOs I’ve talked with say they’ve spent more in 2019 on things like security add-ons to correct fundamental IP/Internet flaws than on IP networks.

Ciena proposes another model that they call adaptive IP.  The exact differences between this and normal evolutionary IP are hard to glean from the company’s material, but the overall theme of the material is that IP has to shed the stuff that it no longer uses and get to a lean-and-mean structure, then adopt some strategies that have been available all along to further simplify IP.  One such strategy is source routing, where the originator of a packet (the actual source or a point of transition along the path) appends the forward route to the packet as a series of headers.
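Here’s a toy sketch of the source-routing idea: the originator (or an ingress transit point) attaches the forward path as an ordered stack of headers, and each hop simply consumes the next one, needing no routing tables of its own.  The node names are hypothetical and this is a simplification, not a representation of Ciena’s implementation.

# Minimal source-routing simulation; purely illustrative.
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: str
    route: list[str] = field(default_factory=list)  # ordered list of hops to visit


def forward(packet: Packet, here: str) -> str | None:
    """Return the next hop, consuming one route header; None means we've arrived."""
    if not packet.route:
        return None
    next_hop = packet.route.pop(0)
    print(f"{here}: forwarding to {next_hop}")
    return next_hop


# The source computes (or is handed) the path; transit nodes just obey the headers.
pkt = Packet(payload="hello", route=["edge-a", "core-1", "core-4", "edge-b"])
node = "source"
while (nxt := forward(pkt, node)) is not None:
    node = nxt
print(f"{node}: delivered '{pkt.payload}'")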

I’ve worked extensively with source routing, and it surely has plusses (and minuses), but I think that it also illustrates an important and basic truth about IP and the Internet.  That truth is that there’s a lot going on inside that’s really an attribute of implementation rather than of application.  Much of this has accumulated in an effort to come up with an open set of interfaces within an IP network that would ensure competition among vendors as opposed to classic vendor lock-in.  To me, the big question is whether “adaptive IP” or any other form of vendor-advocated IP really addresses the baseline problem, which is the difference between implementation attributes and application attributes.

The best way to clear up muddled situations is to start at the top, looking in from the outside.  An IP network is a classic, abstract, “black box” to a user.  They see and use what enters and emerges at their own connection.  Inside the network that offers that connection, there are a lot of other things going on, but those things are requirements of implementation and not application.  Users push packets in and draw packets out, decode URLs to IP addresses, and perhaps (only perhaps) use a few control packets like “ping”.  We can thus define a different IP network model, which we’ll call abstract IP, that exposes only the properties actually exercised at the connection points.
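To make that concrete, here’s a minimal sketch of the contract an attachment point to “abstract IP” would expose.  The class and method names are my own invention, purely illustrative; the point is that nothing behind this interface is visible to the user.

# Sketch of the connection-point contract for "abstract IP"; names are illustrative.
from abc import ABC, abstractmethod


class AbstractIPAttachment(ABC):
    """What a user or adjacent network actually exercises at the edge."""

    @abstractmethod
    def send(self, dest_ip: str, payload: bytes) -> None: ...

    @abstractmethod
    def receive(self) -> tuple[str, bytes]: ...   # (source_ip, payload)

    @abstractmethod
    def resolve(self, url_host: str) -> str: ...  # DNS-style name-to-address lookup

    @abstractmethod
    def ping(self, dest_ip: str) -> bool: ...     # the "perhaps" control-plane case


# Any subnet implementation (legacy routers, an SDN core behind BGP emulators,
# source-routed paths) is interchangeable if it honors this contract.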

At the user level, abstract IP is very simple, which illustrates a basic truth: if we had IP to do over again, we could rebuild the internals completely and still use the same applications and user connections, providing we designed our abstraction to fit that requirement.  A related truth is that if we had a “subnet”, an IP network community that interfaced with other communities inside the Internet or another IP network, we could define an abstraction that resolved what those other communities, as virtual users, needed from our new abstract subnet.

Google demonstrated this a long time ago with their SDN core.  They surrounded it with a series of “BGP emulator” instances that presented what BGP partner networks would expect to see, and inside that ring they were free to do whatever worked to move packets optimally around.  As I once said in a conference, “It’s fine if routers use topology exchanges to guide packets.  It’s fine if Tinker Bell carries them on little silver wings.”  Inside the black box, anything that works is fine, which is what we should take as our baseline requirement for IP network modernization.

There is a huge, hidden, cost associated with demanding box-for-box interchangeability within a network.  You have to pick specific internal mechanisms for routing and status exchanges, because every box might belong to a different vendor.  Many doubt, Google included, that the benefit of open competition for boxes offsets the cost of being forced to adopt consensus feature sets at the box level.  If the abstraction boundaries themselves are open, though, it’s that overall openness that protects operators from lock-in, and the penalties associated with requiring box-level interchangeability aren’t justified at all.

IP networks have embraced this approach implicitly in the past.  The Next Hop Resolution Protocol (NHRP) was defined to allow virtual-circuit networks like frame relay and ATM to move IP packets, by defining how they’d pass packets to the right edge points on a “non-broadcast multi-access network” or NBMA.  Obviously, the same principle could be applied to define an interface between a legacy/evolutionary IP network and any other network that could present a suitable IP interface.  And, I’d point out, ATM and frame relay used source routing.

My concept of abstract IP says that any strategy for moving IP packets that can satisfy the interface requirements to adjacent users or network elements is fine.  An “abstract subnet” then looks like a virtual device that’s compatible with its neighbors.  Google created an abstract BGP subnet.  Back in the days of old, Ipsilon proposed to have edge devices recognize “persistent flows” and route them on ATM virtual circuits to their destination, and that would be an acceptable implementation of an abstract subnet.  So would source-routing in Ciena’s Adaptive IP.

Is the “black-box substitution” test enough to validate an implementation of abstract IP, though?  There are a lot of really inefficient and ineffective things about evolutionary IP today, but it’s easy to see we could easily create even worse things by accident.  There’s an implied value test, then; abstract IP has to offer some value over its evolutionary alternative.

That means that the implementation of things like flow bypass (Ipsilon) or source routing (ATM, frame relay, Adaptive IP) has to be better than just replacing the adaptive subnet with actual IP routers.  Google obviously met that test.  Arguably, Ipsilon did not, since its approach failed, having been displaced by an all-IP MPLS strategy that originated with StrataCom and was acquired and developed by Cisco.  The key point is that the “MPLS abstraction” doesn’t really replace IP at all, and thus could be said to fail the value test.

SD-WAN is, in a sense, the modern example of an attempt to define abstract IP, and of course so is Ciena’s Adaptive IP model.  Whether either is inherently valuable depends on what’s inside the abstraction.  Does the abstraction deliver benefits that a traditional evolutionary IP implementation would not?  Does the abstraction offer simpler, cheaper, implementation?  If I build my abstract IP out of what are just my own routers or router instances, and if I offer no distinctive incremental value to traditional IP, I’ve not really moved the ball much, if at all.

We can use MPLS to create a kind of inside-IP implementation of lower-layer features.  We can use virtual pipes created below IP, using any technology, to provide a virtual underpinning to an IP network that creates the effect of full meshing, with some changes to IP to help it scale.  We can absorb some IP features into that lower layer.  There may not be any single right answer to which is best, which is why I think that we should think first about allowing for the Google-like abstraction of pieces of IP network, abstractions that preserve necessary features and give us a more open set of implementation options.

I think the biggest thing missing in talk about “non-evolutionary” IP in any form is a discussion of these points: what is the abstraction intended to connect with, what are the incremental features presented at the connection point(s), and what is the specific implementation within, including technical requirements for the elements?  Certainly, these points are critical in describing how SDN or packet-optical overlay networks could replace or simplify IP networks.

They’re going to get more critical, too.  Cisco now talks about the P4 flow-definition language for Silicon One.  Ciena talks about Adaptive IP.  We are, both as users of IP and producers of IP networks, starting to look at something we’ve not explored much since those ATM and NHRP days—how do you look like IP without actually being it.  It’s a great thing to be discussing.