There’s a Street View That Cisco is Just Marking Time. Correct?

Those who read my blog know I follow Wall Street’s views carefully, so you’re not likely surprised that I want to reflect on a recent Street-centric analysis of Cisco, run on Seeking Alpha. The title, “Cisco Systems: More of the Same”, sets the tone for the piece, and what we need to reflect on is, first, whether the piece makes fair points, and second, what it might mean for the industry if it does.

The basic problem for network equipment is that everyone who owns a network is trying to reduce what they spend on it. The product of “the network” is pushed bits, and pushed bits are a commodity. To operators, they offer shrinking ROI. Users want them to be both free and invisible. This alone is enough to make network equipment vendors hunger for something to sell that could offer durable, even growing, revenue and margins.

The basic problem for a market leader is that you really have nowhere to go but down. Everyone is chasing your current customers; all your competitors are united by their desire to wipe you off the face of the earth so they can grab the share of the market you control. It’s no wonder that market leaders are always keeping an eye out for something else they can dominate.

Cisco, obviously, fits both these problem-points. They also have a few Cisco-specific issues that impact their response to these problems.

The first of these issues is Cisco’s historical reluctance to rock the boat. No market leader likes to do things that could throw their current base up for grabs, but Cisco has been reluctant even to exploit related market/product areas. “Fast followership” trumps “market leadership” for Cisco.

Market leadership can lead you astray. Jump out in a new area, and if you’re Cisco you’re using your market credentials to validate a new space that, once validated, will be jumped on by competitors. Competitors who have the advantage of seeing what you’ve done and counter-punching. Wait for somebody else to take the risk, then, if the new area really shows promise, use your market might to crush them like a bug.

The second issue, almost surely in part a result of the first, is that Cisco is relentlessly sales-tactical in their thinking. “Sales” because the sales organization has always exercised more influence over senior management than engineering or product management. “Tactical” because Cisco thinks and plans a quarter at a time. Make your quota, make your numbers, show the Street something on the next earnings call. Something that takes a long time to bear fruit either has to be hyped into a shorter-term impact, or pushed aside.

Cisco used to be famous for meeting a new market interest with an announcement of a “five-phase plan”, and they were always in Phase Two when they made the announcement. This was a clever way to define a solution to that new market interest, but at the same time to show that getting to it was a process that had intermediate steps, that the process would take a long time, but that you could buy a step toward it right now from Cisco.

The summary line from the piece I’ve cited says “Cisco continues on the same path, pursuing bolt-on acquisitions to deliver on minimal revenue growth.” I think that’s a fair comment. Cisco has purchased companies, not to change their fundamental strategic position but to augment revenues in the near term and give their sales force something new to sell. They’ve also, as the article suggests, done things like buy back their stock to keep earnings-per-share growth up while earnings ranged from down to only modestly better.

I think Cisco’s “disaggregation” moves, its shift to “recurring” software revenue as a focus, are also examples of a typical Cisco response. It placates the Street, it generates a new revenue source, but it avoids that uncomfortable market leadership role in some new area. Of course, you could wonder whether the current market conditions constitute a “new area” at all.

Operators have been talking about revenue per bit problems for two decades. In that same two decades, enterprises have shifted from a cyclical alternation between simple modernization and advancing their technology to hunkering down on the former of the two. In the past, major upswings in IT and network spending have come in distinct cycles, each lasting about 15 years from start to finish. We ended the last cycle in 1999, and none has come along since. That means that enterprises have been in a modernization/consolidation mode for over 20 years.

What drives investment by network operators and enterprises alike is ROI. You justify a project with some new benefit set whose value exceeds its cost. Once you’ve brought that gear in, sustaining it is just a matter of spending as little as you can for as much as you can get. Only new benefits will drive spending growth, and both operators and enterprises have been looking for new benefits for decades. I recounted both these points to Cisco in the early 2000s, so I know they at least knew of the situation. Market leadership would demand that you address them, but of course fast-followership poses no such requirement.

As the article points out, Cisco’s current financials are boosted by easy comparables, by the COVID mess that created the lowering tide that lowered the boats of the entire networking industry. Maybe new variants won’t stall a recovery and we’ll come out of the slump. Certainly, up until Omicron, the network buyers had been loosening their purse strings, which is why Cisco has an order backlog. Backlogs clear, and comparables eventually have to compare with upticks not downticks.

Vendors have to drive new benefits, through innovative ways of applying technology to improve buyers’ business models. A lack of innovation creates the current technology-buyer stagnation. Buyers don’t sit around figuring out what they could do if they had stuff that no vendor is offering them. Particularly if the new things that would jump-start benefit creation for networking are big changes, complex ecosystemic stuff that’s hard to explain and much harder to invent. Cisco, as a vendor, could be innovative, but so could all of their competitors.

At least two network rivals seem to be trying to polish up their crystal balls, and invest in what they see. Nokia seems to be taking a more open-model view of networking, accepting that the basics of pushing bits or lighting up cell towers with RF is a commodity, and that you need to manage costs accordingly. That leaves them free to think of what’s bigger and better, but we’re not yet hearing Nokia talk about what that might be.

Juniper, in some ways the arch-enemy for Cisco, has taken the M&A game Cisco is said to have played for time and near-term revenue in the exact opposite direction. Their acquisitions have been decidedly (and, for Juniper, uncharacteristically) strategic, and they now have the technology to support a vision of network evolution that’s as close to what I think we should demand as we can get. As I’ve said before, they also have had an articulation gap that limits their ability to leverage what they have.

The article is fair to Cisco in one sense; it presents what I think is a reasonable picture of Cisco’s behavior. It’s unfair in one sense too, in that it doesn’t note that Cisco isn’t the only one that’s doing “more of the same”. The industry has been marking time for two decades, and if vendors, service providers, and enterprises want something different, specific measures will be required.

Supply-side networking says “build it and they will come”. Cisco, suggests the article, has said “Come, and we’ll tell you we’ve built its essential precursor, or maybe we’ll buy whoever really built it”. The right answer is “Speak the future, your vision, to the planners so they can plan to consume your products”. Tell people what you’ve done, where you’re going, and how that will create the benefits that justify buying your stuff. That’s what every vendor, every operator, in the industry needs to do, and what some may indeed be doing.

Network innovators need to look over their shoulders, though. They need to pursue their innovation with determination and speed, because the Cisco fast-follower freight train is right behind them, and the Cisco strategy may end up being right for them, after all.

Ericsson Buys Vonage: What???

The news that Ericsson was buying Vonage took a lot of people by surprise, me included. The general view of media and analysts was negative, and many people took this as a signal that Ericsson was planning to get into the VoIP service business, something Vonage started with and is still known for. I think the reality of the deal is a lot more complicated, and a lot more interesting, so let’s take a look and see why that is.

Ericsson hasn’t been shy in saying that their purpose in the deal is the Vonage Communications Platform software, which has a good set of APIs for advanced unified communications and collaboration services and a strong developer program with broad membership. This is important to Ericsson because it ties into their ongoing effort to make 5G into something more than just an evolution to traditional mobile services.

Nobody, including me, has ever doubted that 5G would deploy. The question has always been whether it deployed as nothing more than an evolved resource to support traditional mobile services. If that’s the case, then it won’t generate any real incremental revenue for operators, and there will be continued price pressure on the infrastructure needed to support it. That hurts vendors like Ericsson in two ways.

The first, and most obvious, way is that operators will be slow-rolling full 5G features and putting discount pressure on vendors to hold down costs. We’re already seeing this effect in the fact that the so-called 5G we have to date is what one operator wryly called “half-5G”, meaning 5G Non-Stand-Alone, where 5G New Radio is combined with LTE core facilities. 5G core is an extreme rarity these days, and vendors who support it are concerned that the slow uptake will not only cut their near-term revenue, but maybe even limit the scope of their opportunity for 5G-specific gear in the long term.

The second reason, which is less obvious, may be more important in understanding the deal. Operator fears about lock-in and high costs have turned many to open-model 5G, and to O-RAN in particular. An open-model network for 5G is certain to put price pressure on proprietary elements, which is bad. It is also likely to spawn an open framework of service innovation, services built above the 5G specifications and outside the normal scope of influence and knowledge of vendors like Ericsson. That focuses operators on things Ericsson didn’t even offer…until recently.

There is a value in Vonage’s communications platform; it’s a good framework for offering collaborative services to enterprises. The problem is that it competes with giants like Microsoft Teams, Zoom, and Cisco’s Webex. Ericsson may hope to bring a full collaborative application framework to its product inventory, from which it could offer the APIs and development program as a means for operators to build their own toolkit and compete with the web giants. Realistic? Not likely, at least in my view, and Ericsson’s stock took a hit on the announcement, which suggests Wall Street doesn’t see the obvious story as a good one.

I can’t think of a single incident where a telco successfully competed with a web incumbent for a higher-layer service. I can’t think of many recent ones where they even tried. But could Ericsson be less interested in Vonage’s communications platform for UC/UCC applications than as a means of creating advanced 5G communications services? This could be the voice counterpart to Ericsson’s previous Cradlepoint acquisition, a play on the wireless edge for branch office networking. Ericsson might see operators offering an enterprise-specific service suite via 5G and network slicing. The goal could be to validate both private enterprise 5G, and enterprise-specific 5G services, more than to advance generalized collaborative services.

This would be a credible goal for Ericsson, but it still raises questions. First, could Ericsson make it work? Second, would it address the two “hurts” that Ericsson is suffering in 5G? Third, did Ericsson pay more for Vonage, and even Cradlepoint, than they should have?

Ericsson says they intend to keep Vonage as a separate unit, but that doesn’t mean that Ericsson wouldn’t influence the way Vonage did business. If they didn’t, in fact, they’d really be buying Vonage revenue at a premium over its share price. If they do meddle, we’d have a telco equipment vendor trying to tell a web company what to do, which is another of the things that have historically never ended well. However, Ericsson really doesn’t have to push either Vonage or Cradlepoint in a new direction, just position their current direction in a 5G flavor.

That direction, enterprise private 5G or operator-provided, enterprise-specific, 5G services, is one that all telcos and telco vendors have come to love, not so much because of its proven potential to pull 5G through in full, countering the first of my two Ericsson hurts, but because it hasn’t yet proved incapable of doing that. In our hype-driven industry, every operator and vendor jumps on the most credible story and stays with it until it’s been demonstrated to be over-hyped. Then they move on to the next one. The alternative is to admit to the Street that they have no strategy.

There is some enterprise private LTE in one form or another, but I don’t see a clamor for the 5G successor. At least not in the volume needed to justify a couple of acquisitions with a combined price tag in the billions. There is an opportunity for specific 5G services targeting enterprises rather than consumers, and surely there’d be less price pressure on this sort of service, but is the opportunity big enough to really matter, and can Ericsson exploit it?

The big problem for Ericsson, though, is the second hurt. Open-model 5G could introduce web-scale innovation to the same market that Ericsson hopes to take over using proprietary technology. Yes, Vonage has UC/UCC-compatible APIs, and yes, those could be valuable to enterprises and to firms who want to sell enterprises private infrastructure or specialized services. What might a web-like innovation war do, though? First and foremost, it could validate a set of APIs of its own, and open-source software that makes use of them. Second, it could then quickly create a community of developers bigger than Vonage’s. Third, it could extend the notion of “higher layer” services way beyond either Vonage’s or Cradlepoint’s vision.

Nokia may be the embodiment of the open-model 5G risk to Ericsson. They’re a highly credible mobile infrastructure vendor with decades of engagement with operators, and they’ve jumped out to be a leader in the whole open-model space, including O-RAN. In fact, there are some in the O-RAN space who are concerned that Nokia poses a threat to the “openness” of the movement because they’re a kind of establishment choice in the open-model space, and there have been stories that have worked to put Nokia in a bad light. Whether or not Nokia is a threat to openness, they are a bigger threat to Ericsson if they can gain ascendancy in open-model 5G.

I’m not saying that Ericsson is doomed to fail with this deal, only that I think it’s a doubling down on the enterprise-specific 5G services notion that’s been problematic from the first. Maybe they don’t think they have any choice. Ericsson is more dependent on maintaining a proprietary model of 5G than any other 5G vendor, and that vision can’t survive a big success with open-model 5G, period. That there’s no choice but to do something doesn’t make the “something” easy to do. This is going to be hard, very hard, and the consequences of getting it wrong just got a lot more dire. If Ericsson knows that and has a strong strategy to mitigate the risk, reduce those consequences, this might turn out to be a good idea.

I’m waiting to see.

Electrical Utility Entry into Broadband Could Impact More than We Think

The problem with the notion that competition could spur broadband deployment to the under-served, even with the (modest) stimulus of the infrastructure bill just signed, is that broadband just isn’t all that profitable. The majority of the initiatives we’re seeing are targeting not so much “under-served areas” as “under-served micro-areas”. Even in rural areas, there are pockets (communities) where demand density is high enough that, with subsidies, it could drive even fiber deployment. Picking off those communities is one strategy to increase broadband access, but there may be another, better, one.

There are two problems with broadband as a business. One is that the profit margins are low, even in areas with decent demand density. The other is cost: there’s often a substantial “first cost” associated with getting enough infrastructure in place to even offer to connect customers, which means you have to spend a lot just to enter a market, and different technologies have different “pass costs”, so in areas of low demand density it’s common to avoid the technologies with the highest pass cost.

The telcos worldwide tend to be adapted to low margins; they were once either government elements (PTTs for “Postal, Telegraph, and Telephone”) or public utilities and protected monopolies. If we wanted to look for players who might want to enter the broadband space, it would be logical to look at other public utilities. These companies have rights of way to install stuff. They have connections to homes and offices, field support personnel, call centers for customer support, and so forth. They also tend to be “cash flow machines”, generating a lot of cash and benefitting from very low borrowing (bond and loan) rates. The electric companies, in particular, seem a logical source of new competition.

Utility entry into broadband has been controversial from the first, and it still is. Telenor recently threatened (or postured) that it would start selling electricity to customers if electrical utilities entered the telecom market. Going back almost 20 years, there was talk about broadband-over-power-line, and there were some initiatives that actually got to the market, but the technology couldn’t compete with fiber or CATV cable, and eventually even cellular service. That’s not the focus today.

Electric companies have poles and people who climb them. That reduces their cost of installing the initial infrastructure (passing). Others, in many areas, are already stringing broadband cabling on power poles, in fact. They also have funds, an attractive cost of money, and other assets that most broadband competitors don’t have. So could electric utilities solve the problem of under-served areas? Not so fast.

Residential demand densities relate to household density, which varies significantly. In urban settings, there’s no question that you can serve business and residential customers profitably. A tightly packed suburb might have 50×100 foot lots, which could be packed, with street right of way, to roughly 8 lots per acre. A more distant suburb would typically pack no more than 2 lots per acre, and in near-rural areas the median household density is down around 1 lot for every four acres. But put in terms of “passes”, a mile of PON fiber would pass a hundred tight-suburb homes, roughly 35 homes in the deeper suburbs, and 12 homes in near-rural areas. Go deeper into the rural zone and you could lay a mile of fiber and pass one customer. Most operators tell me that fiber deployments to areas with less than 16 homes per mile of “passing fiber” would present an ROI challenge, and less than 10 would probably not be considered without some subsidization.
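To make the passing arithmetic concrete, here’s a minimal back-of-envelope sketch. It assumes a single-sided run of distribution fiber and infers lot frontage from the lot sizes above; the frontage figures and the 16- and 10-homes-per-mile thresholds are the rough numbers operators cited, used purely for illustration.

```python
# A back-of-envelope sketch of the passing arithmetic above. Frontage figures are
# inferred from the lot sizes in the text; the 16 and 10 homes/mile thresholds are
# the rough figures operators cited. All of it is illustrative, not operator data.
FEET_PER_MILE = 5280

def homes_passed_per_mile(lot_frontage_ft: float) -> float:
    """Homes a mile of distribution fiber passes along one side of a street."""
    return FEET_PER_MILE / lot_frontage_ft

scenarios = {
    "tight suburb (50x100 ft lots, ~8/acre)": 50,
    "outer suburb (~2 lots/acre)": 150,
    "near-rural (1 lot per ~4 acres)": 420,
}

ROI_FLOOR = 16        # homes/mile below which unsubsidized fiber gets hard to justify
SUBSIDY_FLOOR = 10    # below this, probably not built even with subsidies

for label, frontage in scenarios.items():
    passes = homes_passed_per_mile(frontage)
    verdict = ("viable" if passes >= ROI_FLOOR
               else "needs subsidy" if passes >= SUBSIDY_FLOOR
               else "unlikely to be built")
    print(f"{label}: ~{passes:.0f} homes/mile -> {verdict}")
```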

This is where electrical utility economies might come into play. One mid-western utility told me they believed they could profitably run fiber to areas where a mile of fiber would pass only 12 customers, and possibly as low as 10. Add in some subsidies and incentives, and the number would drop to 9 to as low as 5. This presumes that the utility would sell fiber broadband and offer streaming live TV through a relationship with a provider. Pull any realistic TV revenue out and the numbers all go up by one or two households/mile. Add in business services, home security, and other services, and they could go down by that same one or two.

Some electrical utilities have offered broadband for a long time; I worked with one that offered business broadband a decade ago. Those who have offered, trialed, or studied it in detail tell me that they’ve determined the key issue is opex. The benefits of reusing facilities and rights of way, and the low ROI requirements of utilities, are real, but they can be eaten up by customer care in particular.

One utility told me that a trial they did showed that broadband Internet generated an astonishing twenty-two times the number of support calls as electrical power did. They also learned that unlike power, broadband is difficult to assess remotely, and that there are conditions that render a broadband connection useless that have nothing to do with the broadband service itself. Their conclusion was that they needed a broadband-specific customer-care portal, a browser-based tool that could help them diagnose Internet problems, and an app that a user could install on a mobile phone and use to interact with support when the Internet connection was down.
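To illustrate the kind of triage such a portal or app might automate, here’s a hypothetical sketch (not any utility’s actual tool) that tries to separate resolver or access-link trouble from Internet-path trouble from problems with the destination site itself. The target host and timeout are arbitrary assumptions.

```python
# A hypothetical triage sketch, not any utility's actual tool: separate "DNS/access
# trouble" from "Internet-path trouble" from "site trouble" so a care portal can point
# the customer (and the agent) in the right direction. Host and timeout are assumptions.
import socket
import urllib.error
import urllib.request

def triage(host="example.com", url="https://example.com", timeout=5):
    # Step 1: name resolution. Failure here usually means the access link or the
    # resolver, i.e., something the broadband provider can actually act on.
    try:
        socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return "DNS resolution failed: likely an access-link or resolver problem"
    # Step 2: fetch a well-known site. DNS works, so a failure here points past the
    # broadband service toward the Internet path or the destination itself.
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"Connectivity looks healthy (HTTP {resp.status}); problem is probably site-specific"
    except urllib.error.HTTPError as err:
        return f"Reached the site but got HTTP {err.code}: a site problem, not a broadband one"
    except OSError:
        return "DNS works but the fetch failed: an Internet-path or site outage"

print(triage())
```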

One thing I found interesting in the comments made by utilities was the extent to which traditional telco and cableco ISPs agreed with the points. Users of major US ISPs report problems with broadband Internet support that sound just like the problems that electrical utilities want to avoid, yet most of those ISPs have not adopted any of the measures I just cited, and many haven’t seriously considered them. Some will admit that they’re deterred by the cost of the new measures, reasoning (probably correctly) that customers aren’t leaving now over these issues, and therefore probably won’t leave in the future.

Utilities may be in an interesting position with broadband. They not only have some financial, outside plant, and workforce advantages they could exploit, they can draw on the negative experience of early ISPs and address issues those ISPs have faced but have failed to deal with. They might be able to create a customer care framework that would not only improve support responses, but also reduce churn, which is actually associated with the largest component of opex.

A utility focus on opex and, in particular, customer care might also impact the seemingly endless stream of website, DNS, and other above-the-network problems we’ve seen. Consumers can’t distinguish between a broadband problem, an Internet problem, and a site problem, and some believe that ISPs have been slow to offer strong customer care tools because it would likely involve them in problems that aren’t theirs to fix. Maybe they aren’t, but the ISPs have influence on the rest of the structure of the Internet, and the rest of the players. Might a customer care revolution at the ISP level, driven by utility entry into the market, end up creating a broader pathway for Internet users to learn about problems, and a broader incentive for everyone to get their act together? It could happen.

The utilities’ focus on opex might be especially appropriate for ISPs today, given that the growing interest in and dependence on streaming video erodes the stickiness of one of the things that wireline ISPs have depended on to acquire customers and reduce churn. Could customer care become a major differentiator, an attractor, and a way of retaining customers? On the surface, it seems like it should be possible, and it may be that utility broadband, if it does grow, will show the way.

Rethinking the Very Nature of Infrastructure

It probably seems to be a silly question, but what is a network, these days? What is a cloud, a service? We’re seeing a series of technology shifts and business changes that are blurring traditional boundaries. At the same time, we seem to be unwilling to look for a new playbook to describe what’s happened, or happening. That puts a lot of capital investment and revenue opportunity at risk, so today we’ll try to sort some things out.

Starting at the top is my favorite approach, and the top of everything now is the notion of a “service”. A service is a useful, valuable, and so billable capability that is delivered in the form it’s expected to be consumed, not as a toolkit to build things that can then produce it. In computing, as-a-service technology has been applied to everything from what’s effectively hosting/server-as-a-service (IaaS) to application services (SaaS). In networking, services have focused on connectivity, but the concept of the connection has elevated from the physical level (Level 1 of the OSI model) upward to Level 2 (carrier Ethernet, VLAN), Level 3 (IP services, including VPNs, SD-WAN, and the Internet) and in some cases, even higher.

When you rise above connectivity at Level 1, you necessarily involve functions like packet processing, functions that could be performed in a purpose-built device but also by something hosted on a server. Do enough of that hosting and you have a resource pool and a cloud, and this is where computing and networking have started to merge.

Actually, they’ve already merged in a sense. IP networking demands a number of hosted functions to be workable, including address assignment (DHCP), URL decoding (DNS), and content delivery networks (CDNs). As we add features to improve security, manageability, and discovery, we create opportunities to incorporate these things into the “service” of IP connectivity. As we enhance mobile communications, we’re explicitly committing more “network” features to hosting, and we’re spreading hosting resources to do that.

The network’s potential use of function hosting is important, but particularly important as it relates to specific services. Hosting is attractive to the extent that it’s profitable, meaning that you can sell it or it reduces your costs. Both these attributes are difficult to achieve without some economy of scale, a resource pool. The best place to establish an efficient resource pool is in some central place where you can stick a massive data center, but that approach will put the resources too far from the user to support latency-sensitive services. Put the resource pool where it serves users best, right at the access edge, and you don’t have enough users to justify a pool with reasonable economy of scale.

Deep hosting is problematic for another reason, the challenge of personalization. At the edge of the network, an access connection serves a user. In the core, that user’s traffic is part of what’s likely a multi-gigabit flow that you simply cannot afford to dissect and process individually. Go too deep with function hosting and per-user stuff is impractical. Go too shallow and you’ve specialized your hosting to a single user.

Metro is the sweet spot for function hosting. It’s close enough to the edge that it can support most latency-sensitive applications, and it’s also close enough to allow for personalization of services because you can handle individual customers, even individual sessions where necessary. As it happens, metro is also a great place to host generalized edge computing services, and that’s creating the current uncertain dynamic between network operators and cloud providers.

The problem with edge computing as a general service is that we shouldn’t be calling it edge computing at all. The use of what’s essentially a geographic or topological term implies that the only differentiator for edge computing is where it is. That implies that it could be either the cloud pushing close, or enterprise compute resources pushing out, and either would suggest that it’s just another kind of public cloud computing. As I’ve noted before, an edge market implies that it offers something different from both cloud and premises. In the case of the cloud, it offers lower latency. In the case of the premises, it offers the scalability and resilience and expense-versus-capital-cost benefits of the cloud to applications that were on-premises because of low latency requirements. This is why I’ve said that latency is the driver of the edge opportunity.

Latency is also the source of the challenge, because it’s obvious that we don’t have any significant edge computing service deployment today, and yet we have applications that are latency-sensitive. There’s a boatload of IoT applications that rely today on premises hosting of application components to shorten the control loop. That means that “private” edge is a viable strategy. Could there be other IoT applications, or applications other than IoT that are latency sensitive, that could exploit edge computing? Sure, but what justifies it, what induces somebody to deploy edge resources in the hope of pulling those applications out of the woodwork?

That’s where the network stuff comes in. If network service components, from DNS/DHCP to CDNs, from 5G hosting to NFV, from live streaming channel guides to advertising personalization, were to be hosted in the metro area, they’d justify a local resource pool that could then be applied to general edge computing applications.

The market flap about whether network operators are selling their (edge) souls to the cloud providers has its legitimate roots in this issue. If operators elect to either partner with cloud providers in a shared-real-estate deal, or simply outsource their network function hosting to the cloud, they don’t stimulate their own edge solutions. However, it’s interesting to note that while this debate on who owns the future edge is going on, those very cloud providers are pushing a premises-hosted IoT edge strategy to enterprises.

Cloud providers do not want network operators entering the edge market, for obvious reasons. Do the cloud providers see the impact of 5G and other network service hosting requirements on edge computing as being minimal, at least in the near term? That would explain why they’re pushing to bypass the notion of edge-as-a-service in favor of enterprise do-edge-yourself. They might also fear that operators’ IoT interests could tip the scales and induce operators to deploy edge services of their own.

This issue could cloud the next issue, which from a technology perspective is more complicated. If metro is the hosting point of the future, then what exactly makes it up and what vendor provides the critical technology? There is no question that metro has to include hosting, because it’s hosted features/functions that are driving the focus to metro in the first place. There’s no question that it has to include networking, within the resource pools and connecting all the service/network elements. Obviously software will be required, too. What dominates the picture?

I think the answer is “the hosting”, because what’s changing metro is the hosting. There isn’t any metro without an edge-host resource pool, so metro is then built around data centers, and the first new network role is to provide connections within them. The second is to provide connections between the data centers in the metro, and between the networked data centers and the access network and core.

If latency is critical and event processing is what makes it critical, then the hosting could end up looking very different from traditional cloud computing, which is largely based on x64 processors. It’s likely that we would see more RISC and GPU processing because those technologies are especially good with microservice-and-event architectures. It’s also likely that we’d elect to do packet processing on smart interface cards because that would make execution of those generic tasks, including security, faster in both throughput and latency terms.

On the software side, things are even more complicated. We have three basic software models in play that might be adopted. The obvious one is the “cloud model”, meaning a replica of the public cloud web services and hosting features. Another is the Network Functions Virtualization (NFV) model promoted by the network operators through the ETSI ISG work that started in 2013. The third is the event-processing model that already exists in IoT, based on lightweight OS (real-time or embedded systems) and minimalist middleware. Each of these has pluses and minuses depending on the application.

The final issue is what I’ll call “edge middleware”, the tools/features needed to support applications that are not only run at “the edge” but that require edge coordination and synchronization over a wider area. This sort of kit is essential for the metaverse concept to work, but it’s also likely to be required for IoT where events are sourced over a wider geography. It’s this edge-middleware thing, and the possibility that it will drive a different software development and hosting model, that I think is behind the story that Qualcomm believes the metaverse could be the “next computing platform”.

The metaverse angle is important, and not just because it’s generating a lot of media interest. The reality of edge computing and latency is that network function hosting isn’t likely to be a true real-time application. Much of the network signaling can tolerate latencies ranging over a second. IoT is demanding in latency terms, but much of its requirements can be satisfied with an on-premises controller. Metaverse synchronization, on the other hand, is very demanding because any lag in synchronizing behaviors within a metaverse impacts its realism, its credibility. That’s why Qualcomm’s comment isn’t silly.

These technology points impact the question of who formalizes the thing that network and cloud may be converging on. The more the technology of that converged future matches that of the cloud present, the more likely it is that public cloud providers will dominate. That dominance would be shaken most if a software model evolved that didn’t match the cloud, which would be the case if the “edge middleware” evolved early. That could happen if somebody got smart and anticipated what the tools would look like, but it might also emerge if the 5G activity promoted a kind of functional merger between the RAN/Radio Intelligent Controllers (RICs) of O-RAN and cloud orchestration.

Network vendors are most likely to be influential in that latter situation, if 5G and network features drive enough metro hosting to push a resource pool into place before separate edge applications like IoT have a chance. If a network vendor pushed their 5G and RIC position to the extreme and included the synchronization and coordination mission I mentioned earlier, that vendor might have a shot at controlling what happens at that up-top convergence point between network and cloud.

Why MEC Might be Getting Crippled by Uncertainty

OK, there’s no lack of (or loss of) interest in multi-access edge computing (MEC), but there’s also no lack of fuzziness over the right way to think about it. A recent Light Reading article, the first of four promised pieces on the cable industry’s view of MEC, offers some insights on the important question of the way network operators might play in the game. It also demonstrates, perhaps less clearly, that there’s a big issue in the way of MEC.

Any topic, however potentially revolutionary, can be exhausted from a media coverage perspective. Virtually all our tech news is ad sponsored, and that means that what gets reported is what gets clicked on. With any topic, click potential is based on news value rather than truth value, and while most people might click on the first dozen or so articles on something like cloud computing, you eventually reach a point where they stop clicking. Some of that is due to the fact that it’s very difficult to write deep stories, and that superficial things get used up quickly. Edge computing is a good example.

The problem with edge computing right now is that even potential suppliers of edge services are having a problem coming to terms with just what it might be good for, and just how it would have to be provided. These topics are complicated to cover, and the truth is that much of the stuff that needs to be communicated would be dull, dry, and targeted to a very limited audience. As a result, cable companies like those discussed in the LR piece are conveying uncertainty as much as insight.

We can divide the concept of edge computing in a number of ways, ranging from how it’s done to who offers it, and on to what it’s good for. I think most everyone would agree that edge computing has to have a distinct mission versus public cloud, and that distinct mission is latency-sensitive applications. You compute close to the source of work because you can’t tolerate delay. That leaves us with the “who” and the “how”, and I think that the “how” is likely, maybe certain, to establish the “who”, but even the “how” dimension has to be divided.

The big division on the “how” is between the vision that the edge is simply the cloud pushed closer to the user, and that the edge is architecturally different. In the transplanted-cloud vision, the features of edge computing are derived from current cloud features, placed so as to optimize latency control. Development of edge applications would follow the pattern of cloud application development, meaning that it would likely be specialized to the enterprises who adopt the model. In the edge-is-a-new-architecture vision, the edge applications rely largely on features that are not present in the cloud, or are present only in a very primitive form. Development of those features then takes away a lot of the heavy lifting in application development. Think of the vision difference as an IaaS versus PaaS or even SaaS approach to MEC.

The Light Reading story cites Heavy Reading research that says that cable companies are thinking of a “hybrid cloud” model, a term I think is unfortunate because it links edge to enterprise cloud usage, and thus really tips the vision of MEC toward traditional cloud computing features. They say that cable companies believe they’ll pick a MEC strategy (build or buy) depending on the application. But to run MEC anywhere, they have to use tools, and if there are tools only to support the cloud-centric MEC vision, then they’ve bought into the idea that their edge computing will be hosted by the cloud providers. That’s problematic for two reasons.

The first reason is obvious; if the network operators don’t build out their own edge hosting, they’re surrendering the market for MEC to the cloud providers, and they’re either a user or a reseller of those services, not a creator. If there is a decisive shift of service infrastructure from simple network switching/routing to hosted features, and if future revenues depend on even higher-level service features, then they’ve thrown away what might well be the last opportunity to stop the disintermediation trend all operators have complained about for decades.

The second reason, believe it or not, could be worse. If edge applications require features that are not part of the cloud-provider web-service inventory, then a lack of those features could impact the pace at which MEC is exploited by enterprises, and there could be a Balkanization of development models and tools that would create silos and prevent the establishment of a single model for developers to learn and operations personnel to manage.

So what are the things that “the edge” might need and that the cloud doesn’t have? I think they’re divided too; this time into charging/cost issues and feature issues.

Edge applications are latency-sensitive, remember? The great majority of latency-sensitive applications relate to supporting some real-time activity, including gaming, IoT, and even “metaversing”. Those applications are almost always event-driven, meaning that they receive some real-time stimulus and process it to create a real-time response. The traditional way cloud providers have addressed this is through “serverless” offerings, which means the users buy processing rather than capacity to process. Everyone who’s tried serverless knows that it’s great when the alternative is having fixed resources sitting around waiting for work, but terrible when a lot of work actually gets presented. The pricing model for MEC, if it’s based on serverless, could end up pricing MEC out of its own market.
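A toy calculation shows why the pricing model matters. Every price below is an assumption for illustration, not any provider’s actual rate card; the point is the crossover, where continuous event volume makes per-invocation pricing more expensive than dedicated capacity.

```python
# A toy cost comparison; every price here is an assumption for illustration, not a
# provider's rate card. The point is the crossover: per-invocation pricing is cheap
# when events are sparse and expensive once work arrives continuously.
PRICE_PER_MILLION_INVOCATIONS = 0.20   # assumed
PRICE_PER_GB_SECOND = 0.0000167        # assumed
FIXED_INSTANCE_PER_MONTH = 60.00       # assumed small dedicated instance

def serverless_monthly_cost(events_per_sec, mem_gb=0.5, run_ms=50):
    events = events_per_sec * 86400 * 30          # events per month
    invocations = events / 1_000_000 * PRICE_PER_MILLION_INVOCATIONS
    compute = events * mem_gb * (run_ms / 1000) * PRICE_PER_GB_SECOND
    return invocations + compute

for rate in (0.1, 1, 10, 100):
    cost = serverless_monthly_cost(rate)
    winner = "serverless" if cost < FIXED_INSTANCE_PER_MONTH else "fixed capacity"
    print(f"{rate:>5} events/s: serverless ~${cost:8.2f}/mo vs ${FIXED_INSTANCE_PER_MONTH:.2f}/mo fixed -> {winner}")
```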

This problem could be exacerbated by the fact that serverless computing’s performance (in particular, the latency) lags traditional cloud computing. Users of serverless report load-and-run delays running sometimes to the hundreds of milliseconds, which is way beyond what could be considered “low latency”. Even this problem could be worsened because edge resource pools are constrained by the smaller addressable market of a given edge site compared with a cloud regional hosting point.

The technical problems are also complicated. First and foremost, many edge applications are really multi-edge applications, meaning that they involve event sources that are distributed further than a single edge site could support. What that means is that edge applications could involve the same sort of “near-real-time” or “true-real-time” versus “non-real-time” functional division that we see in O-RAN, only probably with more categories.

Applications like IoT, in its industrial, warehouse, or even smart-city forms, are likely to be suitable for single-edge hosting of the real-time elements, but even these may then have to hand off processing to a deeper resource when the real-time control loops have been closed by the edge. We could envision the processing of events for these applications as a series of concatenated control loops, each touch point representing a place where delay sensitivity changes significantly. But while we might know where application constraints on delay change, we don’t know what the actual delay associated with available hosting points might be. In one situation, we might be able to support Control Loop A at the metro or even neighborhood edge and then hand off Control Loop B to a regional center, and in another we might have to hand it to a close alternative metro point. We need to be able to orchestrate deployments based on experienced delay versus control-loop delay tolerance.
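A minimal sketch of what that orchestration decision might look like: for each control loop, pick the deepest (most shared, most economical) hosting tier whose measured delay still fits the loop’s tolerance. The tier names, measured delays, and loop budgets are all illustrative assumptions.

```python
# An illustrative sketch of latency-aware placement: for each control loop, choose the
# deepest (most shared, most economical) tier whose measured delay fits the loop's
# tolerance. Tier names, measured delays, and budgets are all assumptions.
MEASURED_RTT_MS = {          # experienced delay from the event source to each tier
    "neighborhood-edge": 3,
    "metro": 8,
    "alt-metro": 15,
    "regional": 35,
    "central-cloud": 80,
}
# Ordered from deepest/cheapest to shallowest/most expensive to use.
PREFERENCE = ["central-cloud", "regional", "alt-metro", "metro", "neighborhood-edge"]

def place(loop_budget_ms):
    """Return the most economical tier that still meets the loop's delay tolerance."""
    for tier in PREFERENCE:
        if MEASURED_RTT_MS[tier] <= loop_budget_ms:
            return tier
    return None    # nothing fits; the loop stays on-premises

control_loops = {"Control Loop A": 10, "Control Loop B": 50}   # tolerances in ms (assumed)
for loop, budget in control_loops.items():
    print(f"{loop} ({budget} ms budget) -> {place(budget) or 'premises only'}")
```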

It’s clear from this that having the potential hosting points connected through high-capacity, low-latency, trunks with minimal trans-switching of traffic would be almost essential. Metro area hosting points would surely have to be almost- or fully meshed, and metro areas themselves would also likely have to be meshed, and multi-homed to regional or central points. We aggregate traffic to places where we host functions, not to places where we can jump on a high-capacity trunk to achieve bandwidth economy of scale.

All of this could be especially critical in applications like the metaverse, which requires the coordination of widely separated application elements, in this case each representing an “inhabitant”. A realistic experience depends on having the behaviors of interacting avatars synchronized with the people they represent, who inhabit the distributed real world. In fact, the full potential of a metaverse, as I’ve noted in prior blogs, couldn’t be realized without that coordination.

Even IoT applications could be impacted by a lack of edge-specific tools and models. The current “edge” strategy for IoT is to create the edge by pushing the cloud onto the premises, using enterprises’ own servers. This tends to encourage applications that close the most sensitive control loop off before it ever gets off-site, and the public cloud software that’s pulled onto the premises is often “transactionalizing” edge events, recording them after the real-time processing has been completed. If real-time handling is already done before events leave the premises, then there’s little latency-critical work left for edge computing to do, and the whole business model of the edge is questionable.

The features needed to make all this work are going to be turned into platform software, PaaS, or SaaS by somebody. Operators, by dithering over the question of whether to build their own cloud or buy into one or more cloud provider partners, are missing the big question. Will they build that platform software? If they don’t, then they absolutely surrender edge differentiation, probably forever.

Initiatives Take Hosting Beyond x64, and Maybe Define the Edge

If edge computing is different from cloud computing, then it would seem likely that there are technical elements that would have different emphasis in those two spaces. One such element is fundamental to both: hosting. The differences, and the reasons for those differences, arise out of the mission of edge versus the mission of cloud.

There’s no reason to put hosting close to the edge other than to reduce latency. Edge is by nature going to be spread out, meaning more real estate costs and more difficulty in achieving optimum economy of scale. The justification for accepting these challenges is that some applications require low latency, lower than can likely be achieved through reliance on regional or central cloud data centers. Applications that require low latency tend to be event-driven rather than transactional, the traditional model of data center apps, and as I noted in an earlier blog, that often means a different CPU architecture, or even a GPU.

Architectural shifts don’t stop there, either. We already see white-box devices built with custom switching chips (Broadcom’s for example), and there’s also growing interest in adding intelligence to interface cards to offload work and improve overall performance. This trend means that it’s very likely that we’ll see “servers” that serve something other than x64 hosting and that include local interface intelligence as well as a collection of CPU/GPUs.

Then there’s AI, which is spawning its own set of custom chips. Google’s Tensor chip, used in the new Pixel 6 family, has an integrated AI processing accelerator, and vendors like Cadence and Maxim Integrated have AI processors. Experience already shows that these are likely to be integrated into server platforms, combined with all the other CPU, GPU, networking chips, and interface smarts.

AI is already popular as a facilitator for some IoT applications, and as we start to face IoT for real (rather than for-ink, media-driven), we may well come up with IoT processes that could justify another set of custom silicon. The point is that as the volume of anything increases, the pressure to enhance its efficient processing increases too. If we do face edge, cloud, and network revolutions, those revolutions could be drivers of yet more diversity of custom chips.

Unless we assume that there’s only one chip vendor and one standard architecture for each of these chip types, there’s going to be an issue matching software to the hardware mixture of a given server. If we have three or four possible chip types, in two or three different versions, the combination that developers would have to deal with could be daunting; you could end up with a dozen or more different software versions, even if you took steps to constrain the mixture of chips in your deployment.

All of this poses management challenges at several levels. Deployment and redeployment of edge software (microservices) has to accommodate the specific architecture of a given server platform, if the architecture impacts how efficiently some microservices will run on a platform, or even if they’ll run at all. You could tune current orchestration tools (Kubernetes, Linkerd, etc.) to match complex platform features with complex microservice hosting needs, but the more custom chips you introduce, the more combinations you can expect, and the harder it is to manage the resulting puzzle.
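In principle, the matching problem looks something like the sketch below, which is essentially what node labels and selectors do in Kubernetes today, just with more chip dimensions. All of the node, service, and capability names are invented for illustration.

```python
# A simplified sketch of the matching problem: servers advertise their chip mix,
# microservices declare what they need, and the scheduler filters accordingly. This is
# roughly what node labels/selectors do in Kubernetes; all names here are invented.
servers = {
    "edge-node-1": {"x64", "gpu"},
    "edge-node-2": {"arm64", "smartnic"},
    "edge-node-3": {"x64", "smartnic", "ai-accel"},
}

microservices = {
    "video-inference": {"gpu"},          # needs a GPU build of the model runtime
    "packet-telemetry": {"smartnic"},    # offloads capture/filtering to the NIC
    "event-router": set(),               # plain code, runs anywhere
}

def eligible_nodes(requirements):
    """Nodes whose advertised capabilities cover everything the service needs."""
    return [name for name, caps in servers.items() if requirements <= caps]

for svc, needs in microservices.items():
    nodes = eligible_nodes(needs)
    print(f"{svc}: {nodes if nodes else 'no compatible platform'}")
```

The more chip types and versions you add, the more this filtering fragments the resource pool, which is exactly the economy-of-scale problem the paragraph above describes.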

The other level of management challenge this all creates is in the area of lifecycle or fault management. A multiplicity of platform configurations, combined with different data center switching vendors, can create a management problem in the very spot where you can’t afford to have one, where real-time systems that generate events are hosted.

This is the kind of challenge that cries out for a virtualization/abstraction solution. The smart approach at both the development and operations level would be an abstraction of the functionality of a given chip type, with a “driver” (like the P4 driver in switching chips) that matched the abstract or virtual device to one of multiple implementations of the basic functionality. We don’t have a good abstraction for anything much today; even P4 isn’t supported universally (in fact, the most commonly used chips, from Broadcom, don’t support it). In the near term, it’s likely that white-box vendors will provide platforms that have a specific chip/vendor mix and provide the necessary drivers. That could mean that software will initially support only a subset of the possible combinations, since there’s not even a good driver standard today.
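A simple sketch of the abstraction-plus-driver pattern: application and orchestration code programs against a virtual device, and a per-chip driver maps that contract onto whatever silicon is actually there. The class and method names are invented; this is not P4 or any vendor’s actual API.

```python
# A minimal sketch of the abstraction-plus-driver idea: code programs against a virtual
# switching device, and per-chip drivers map that contract onto whatever silicon is
# present. Class and method names are invented; this is not P4 or any vendor's real API.
from abc import ABC, abstractmethod

class SwitchingDevice(ABC):
    """The abstract functionality applications and orchestration code rely on."""
    @abstractmethod
    def install_forwarding_rule(self, match: str, action: str) -> None: ...

class ProgrammableChipDriver(SwitchingDevice):
    def install_forwarding_rule(self, match, action):
        # Would compile the rule into a table entry on a P4-style programmable ASIC.
        print(f"[programmable] table_add {match} -> {action}")

class VendorSDKDriver(SwitchingDevice):
    def install_forwarding_rule(self, match, action):
        # Would call a proprietary merchant-silicon SDK to the same effect.
        print(f"[vendor-sdk] program_rule({match!r}, {action!r})")

def load_driver(chip_type: str) -> SwitchingDevice:
    """Pick the driver that maps the abstract device onto the installed chip."""
    return ProgrammableChipDriver() if chip_type == "programmable" else VendorSDKDriver()

# The calling code is identical regardless of which silicon sits underneath.
device = load_driver("programmable")
device.install_forwarding_rule("dst=10.0.0.0/8", "forward:port3")
```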

On the operations side, we’re already facing issues with both GPUs and NIC chips. These new problems are exacerbated by the fact that the “clusters” of hosting and switching that make up a resource pool don’t present a common operations model. Adding in specialized elements can only make that worse, and edge computing has enough of a challenge in creating a resource pool with reasonable economies as it is.

Juniper has been on a tear recently in pushing some innovative concepts, including their spring Cloud Metro initiative and their recent media/analyst/influencer event where they promoted a new vision of network AI. They’ve now announced an initiative with NVIDIA to extend “hosting” to the NIC, and to bring NIC intelligence under their universal-AI umbrella. Given the edge shift and the fact that Cloud Metro defines a harmony of network and edge computing, it’s not a surprise that this new capability is called “Juniper Edge Services Platform” (JESP). JESP APIs link with the Apstra software to extend management and orchestration into the NIC.

JESP is based on NVIDIA’s BlueField DPU, where “DPU” is “data processing unit”, meaning the server platform itself. The idea is to extend the data center network (which is deployed and managed through Juniper’s Apstra acquisition) right to the servers’ interfaces, where those “smart NIC” chips are likely to be deployed. My own data from operators suggests that by early 2024, more than half of all newly shipped NICs will be smart (Gartner is more optimistic; they say 2023). In the edge, however, it’s likely that smart NICs will dominate more quickly.

NVIDIA also has a plan to accelerate software harmony via its DOCA software model, a model that extends beyond just interface smarts to include even the P4 driver for switching, and therefore could describe a network element (including a server) with a single abstraction. DOCA stands for “Datacenter-On-a-Chip Architecture”, and it’s intended to accelerate the BlueField DPU concept, but it looks to me like it would be easy to extend the model to a white-box system, too.

There are two levels to any abstraction, the mapping to resources and the exposure of functional APIs. We’re seeing a lot more abstraction in the resource side, but applications at the edge and in the cloud still depend on middleware resources as much as hardware. It’s going to be interesting to see whether initiatives like DOCA and JESP will stimulate work on that middleware framework, something that would depend on thinking through the needs of edge hosting, and how it differs from cloud hosting. If it does, we could actually see the edge advance, and quickly.

Is IBM on to Something With Kyndryl?

Could Kyndryl, IBM’s infrastructure services unit spun off as a separate company, be a pathway for IBM to resolve its challenges? I blogged recently about IBM’s quarter and its challenging choice for a cloud strategy. Kyndryl seems to be taking on a broader role; their NYSE listing speech said “We design, build, manage and modernize the mission-critical technology systems that the world depends on every day. As a focused, independent company, we’re building on our foundation of excellence by creating systems in new ways.” That sounds way more like a broad systems integrator mission. Would it help IBM face its cloud challenge, and might it even indicate how other companies will face similar challenges down the road?

Back in the proverbial Good Old Days, companies installed their big computers in a data center and ran their applications there. The cloud came along as the end game in a virtualization trend, a trend that aimed at making applications and their components easily movable to any of many suitable hosts, a “resource pool”. That can have a major impact on QoE and resource efficiency, but there’s no question that it adds significantly to complexity.

A “data center” isn’t just a couple big computers any more. It’s multiple locations, a combination of real hosts and as-a-service features acquired from outside, a set of network resources, a bunch of tools and middleware, and a whole set of application design practices. What the users want, which is their base of applications, is spread out all over the place, connected via their network and the Internet. Users often can’t figure out how to put all the stuff together, and vendors can easily find themselves in a never-ending buyer education task before the prospect can even lay out an RFP, and then not win the darn thing when it’s awarded.

A “network” isn’t just a bunch of trunks and nodes, either. It’s devices, clusters of devices, edge and cloud computing, function-hosting standards, management automation standards, and a whole new set of principles, sometimes several different ones in the same network. Users aren’t any better at putting this new network together than they are the new data center, and vendors in the network space face the same problems as IBM and other compute vendors.

For a company like IBM, this issue has two dimensions. First, any sale is likely to become education-dependent, particularly given that IBM’s greatest strength is account presence and control at major enterprises. Who better to tap for knowledge than someone in the building? Second, the expanding scope of technology means that new stuff is likely to intersect the boundaries of things already deployed and fully depreciated, and to involve technologies that no one vendor really offers.

Managed network services from Managed Service Providers (MSPs) are a proof point of this on the network side, and the success of cloud computing proves it on the hosting side. But most enterprises still need to build stuff; as-a-service isn’t always available, suitable, or economically attractive. That’s why an integrator may be an essential element in the future—for IBM specifically, but for tech in general as well.

Integrators are top-down players, often in one or several specific industry verticals because the level of skill and scope of business knowledge needed may be hard to acquire across the whole market. They could bridge the gap between the user who lacks the skills required to plan and deploy something, and the vendors who can’t get involved in the education process. An integrator subsidiary is one option, but that doesn’t cover the problem of the outside technology that might be needed, and it also has the risk of encouraging buyers to ask for integration services for free, given that they’re buying the gear.

Most vendors have channel programs that include support for integrators, and there are some very large integrators out there already; the revenues of the top ten range into the billions of dollars. These giants aren’t even candidates for most smaller projects and smaller buyers, but again vendor channel programs will include integrators of all sizes. Integrators are already a big chunk of the revenue stream for most network and IT vendors, and their experience shows strengths and limitations of the integrator concept. The strengths relate to specialized skills, a broader product base, and explicit costs. The limitations relate to lack of credibility, sticker shock on pricing, and channel conflicts. Can Kyndryl get those in balance?

Kyndryl is going to inherit a lot of IBM’s credibility with major accounts; in many cases, those enterprises will already have contacts in the new organization. IBM had strong strategic influence with these companies, and it’s likely Kyndryl will inherit at least a lot of that, too. Obviously, the new organization will have an impressive skill set, including a good amount of vertical market expertise, cloud expertise, and that “mission-critical” application expertise.

The downside is the dance that Kyndryl will surely have to do, with prospects, prospective suppliers, and even Wall Street. For prospects and other suppliers, you have the “independent-in-name-only” issue. Can an organization once part of IBM be truly independent? Will they still favor IBM products and services? Will they feed back information through personal relationships they have with former IBM colleagues?

For Wall Street, the obvious elephant in the room is why Kyndryl will earn more as an independent integrator than it did as part of IBM. The stock closed off for the week it was first listed, though not (quite) at its low. The CEO comment that “Now we have complete freedom of action” doesn’t prove that they know the best actions to take, or are able to take them.

One very specific question Kyndryl will have to face to establish its value to IBM, directly or indirectly, is Red Hat. IBM’s biggest problem has been its shrinking total addressable market, a result of dropping out of much of the computer hardware space and having no credible presence anywhere except major enterprises. Red Hat not only established IBM as a strategic player again, it’s been largely responsible for IBM’s gains since the deal closed. Red Hat has a pretty nice hybrid cloud and “integration” story, though they use the term more at the software-connection level than at the technology services level. Does Kyndryl know anything about Red Hat stuff at all? If not, then it can’t resolve IBM’s downmarket problem. That means it’s stuck in the same IBM enterprise base that wasn’t enough for IBM’s success, and that drove IBM to acquire Red Hat in the first place.

IBM says that “Kyndryl has a robust portfolio of multicloud services available for Red Hat OpenShift….”, and the reference I cite here lists examples. From this you could see that Kyndryl will include Red Hat software in its inventory of technologies, but there’s still a question of the prospect base. Kyndryl’s inherited IBM base wouldn’t include anywhere near all the prospects for OpenShift. Will Red Hat salespeople push Kyndryl integration? If not, who reaches those non-IBM-centric prospects?

For all the challenges, Kyndryl may be a good move for IBM, and a signal that other vendors in both IT and networking are going to have to address integration more thoughtfully. Every vendor faces the education barrier to technology adoption, and most have established perhaps-somewhat-casual integrator programs as part of their channel sales strategy. Not every vendor has its own integration activity to spin out, as IBM did, but every vendor will need to ask whether it should be lumping integration programs in with overall channel sales. The risk of integrators doing their own educational selling, only to be trumped by simple discount channel sales from others when the education is done, is simply too great.

The best solution may be a better-organized channel program, something that network and sometimes-server vendor Cisco launched last year. Channel programs can go a long way toward reducing channel-conflict risks for integrators. Coupling those programs more tightly to the vendor not only gives the vendor more ability to police channel behavior, it adds to channel credibility by letting the vendor backstop the solutions offered, which makes smaller and more specialized integrators more credible.

It’s hard for me not to see the Kyndryl spin-out as an attempt by IBM to use “independent” integration services to unify their story with Red Hat and take it down-market. If that’s all there is, then Kyndryl could end up being a much weaker play than a strong integrator channel program would have been. If it’s not, then IBM and Kyndryl need to frame the relationship they really intend, and prove its value to everybody.

Analyzing Cisco’s View of NaaS

Cisco released a report on network-as-a-service (NaaS) that’s eye-opening in some ways and utterly predictable in others. On the one hand, it’s easy to see what users are hoping NaaS will do for them, and the range of their hopes is broader than I’d thought. On the other hand, the report shows a remarkable consistency with past surveys by proving that most people who responded were just winging it.

I’ve noted in several past blogs that users tended to respond to surveys in a way that made them look good, rather than by relating what they really did or even knew to be true. On the average, I’ve found that at least a third of users who respond to surveys don’t report reality, and in this case Cisco found that 36% of their respondents said they already had NaaS, when the number who actually met Cisco’s definition was likely down at the statistical-noise level. More on this later.

Cisco sees NaaS as “a cloud-enabled, usage-based consumption model that allows users to acquire and orchestrate network capabilities without owning, building, or maintaining their own infrastructure.” Cisco said they believed that many users who simply had a managed service provider (MSP) believed they were NaaS users, which would make NaaS more a retail packaging factor than a technology. I think it likely goes further than that, and we’ll have to keep that in mind when we look at some of the specific things Cisco’s survey found. I won’t include them all, just those that directly relate to the question of what a NaaS really is.

The first point in Cisco’s results is that “IT teams recognize the top NaaS benefit as freeing up IT teams to deliver innovation and business value (46%). Another 40% recognize NaaS as improving response to disruptions and 34% as improving network agility.” Obviously multiple responses were allowed here, but I think that the responses show that IT personnel (presumably CIO-level and direct reports) believe that the cloud as-a-service model succeeded in breaking the dependency on owned, fixed, infrastructure for computing. They now want to apply that same model to networking.

To me, this says that what users want from NaaS could in fact be delivered by any service that was usage-priced, resilient, and scalable. You could fit that pricing model to an IP VPN, for example, and they’d be happy. The cloud, then, is less a technical requirement for a NaaS implementation than it is the way that computing technology broke the infrastructure dependency for computing.
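To make that concrete, here’s a minimal sketch of what “fitting the pricing model” to an existing VPN might look like; the function name and the rates are purely hypothetical, and the point is that nothing in the underlying VPN technology has to change to support it.

def vpn_usage_charge(committed_mbps, metered_gb, per_mbps=2.50, per_gb=0.02):
    # Usage-priced billing applied to an ordinary IP VPN site.
    # Rates are invented for illustration; the billing model is the point.
    return committed_mbps * per_mbps + metered_gb * per_gb

# A 100 Mbps committed site that pushed 4 TB this month:
print(vpn_usage_charge(100, 4000))   # 250.0 + 80.0 = 330.0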

The next critical point is “The top services required from NaaS providers are network lifecycle management (48%), network resiliency (42%), and monitoring and troubleshooting to meet SLAs (38%).” This again seems to be a translation of cloud features into the NaaS space, suggesting that users are seeing cloud computing as the go-to as-a-service framework and expecting what they want from it to be available for NaaS too.

The interesting thing here is that none of the responses talks about new features of NaaS. There’s nothing about NaaS as a per-user, per-application sort of service, what I’ve suggested is connectivity as a service. There are no new security features cited, either. Again, these requirements could be met without making real technical changes to existing VPN services.

Some security issues are raised in the next point: “Concerns range from whether NaaS can support unseen emerging demands (30%) to loss of security control (26%). The cost and disruption of transitioning also ranks high (28%).” Note that there’s less unanimity on concerns than on benefits and expectations. The first of the concerns shows a continued presumption of a static service model for networks (can the cloud support unseen emerging demands? Sure.) Disruption and transition costs, the next-biggest concern, seem to suggest that users fear change more than they fear a lack of specific new service features. Security issues are at the bottom, and my own surveys have shown that users will always say they believe there are management or security problems if they need to cite things about a new technology that they find worrisome.

The next point says that 75% of users agree that NaaS will give IT teams and professionals an opportunity to advance their skill sets, but a quarter of users indicated they’d trust an integrator rather than their own staff to put NaaS in place. This seems a bit of a contradiction on its face, but there’s a deeper contradiction too.

It seems to me that the high rate of response on this point indicates the answer was suggested (“Do you think NaaS will help IT teams advance….”) rather than offered spontaneously. It also seems to contradict the assumption of prior points, which was that the biggest goal of NaaS was to reduce the adoption/transformation burden. Finally, isn’t the sense of the earlier goal comments that NaaS was supposed to mean less stuff to own and manage? What is an integrator even doing in that situation?

The final point I want to highlight is the view that NaaS was most likely to be adopted during an upgrade cycle, and that SASE and multi-cloud were likely to be entry points to NaaS. There’s little question that every major change to networks or IT infrastructure would be timed to the point where current assets were largely written down; the cost of the write-down would have to be borne by the project. And since SASE is a “service edge”, it follows it’s a NaaS edge. There’s not much insight conveyed here, in other words.

Now, I want to contribute the results of my own enterprise conversations on NaaS, but to do so I need to return briefly to the point I made earlier about surveys. The most important single truth about surveys is that they produce results when the people surveyed are actually involved with the question being surveyed. If you want to understand what issues currently face network operations, you talk to an operations person. If you want to understand how an enterprise would see its service needs evolving, you talk to a network planner or even an application planner. It shouldn’t be a surprise to any of you who read my blog that I’m a futurist, a strategy consultant. I talk to the latter group, the planners. Vendors usually survey the former, and that results in a difference in perspective.

Planners tell me that they hope NaaS will be an almost-personal thing. You, as a user of the network to reach information and applications, have specific connectivity needs. You want them satisfied, and from your perspective there’s nothing that’s really aiming at that. Planners, hearing from their users, see the possibility for communications that supports the information relationships each user needs, as a service delivered to that user. To these planners, the user gets an IP address and has the ability to decode URLs, and that’s the extent to which they want to interact with the network. Everything else is set up by policy.

In this mode, “secure access service edge” or SASE is a logical gateway to NaaS, but it’s important to understand that having an SASE doesn’t imply that you have NaaS, or even that you could support it. NaaS is a network relationship with the user, like any as-a-service model. To distinguish NaaS from a simple VPN, you have to accept that it goes beyond site networking (which is what a VPN provides) and into user-specific connectivity. Cisco’s paper doesn’t address this, likely because it’s a survey of network people and not of network planners/users.
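A minimal sketch of that distinction, with hypothetical names and policies: a site VPN answers “is this site connected to that site?”, while connectivity as a service answers “does this user have a policy-granted relationship with this application?”

# Hypothetical per-user connectivity policy; everything below is illustrative.
USER_POLICIES = {
    "jsmith": {"crm", "email", "analytics"},
    "contractor-17": {"ticketing"},
}

def connection_allowed(user_id, application):
    # NaaS-style question: does policy grant this user/application relationship?
    return application in USER_POLICIES.get(user_id, set())

print(connection_allowed("jsmith", "crm"))          # True
print(connection_allowed("contractor-17", "crm"))   # False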

What’s the Value of Cloud-Native in Network Software?

I’m sure that you, like me, have read plenty recently about “cloud native” technology in telecom. Given that hype seems to be omnipresent in tech these days, we have to ask whether there’s more “cloud-native-washing” going on than actual “cloud-native”. Rather than try to survey all the claims, why not start by asking what meaning the concept could have, and see where that leads us to look for its application? For a change, I propose to start at the bottom.

Networks are a web of trunks and nodes that create routes for traffic to follow. There’s typically a star-of-stars topology, where access tendrils collect at some metro point, and metro points then collect regionally. We’ve all seen examples of this, both as a whole and in pieces, over our careers.

Cloud-native refers to an application model, a style of construction that’s optimized for virtual/cloud deployment. That means applications designed to be scalable under load and redeployable in case of failure. In the real world, cloud-native is about optimizing the benefits of virtualization, which is the abstraction of resources so that an application that maps to the virtual is transportable to the real.

Both these last paragraphs state the obvious (to many, anyway), but they also reveal the first of our truths about cloud-native in networking. Networks, at the bottom where those trunks live, are essentially nailed to the ground. We can’t just say “OK, host the virtual router over there” when there may not be a trunk, or the right trunk, in that location. The scope of real resources suitable for mapping to bottom-level network functions is limited. However, if we think about rising up from the physical, what we’d generally call the “data plane”, we find that we have more and more latitude with regard to where the real stuff we map to could be located.

Take 5G as an example. Towers are where they are, and so is backhaul fiber. We need to have data-plane resources that are very tightly coupled to the endpoints of real transmission media. If we move upward to the control plane, we still have to be somewhere generally proximate to the media and data plane, but the O-RAN reference to “Near-” and “Non-real-time” RAN Intelligent Controller (RIC) shows that there is, even in O-RAN, a widening scope of hosting options as we climb up from the data plane.

Let’s now shift back to my comfortable top-down-think. What’s at the top of a network? The answer is “operations systems”, NMS and OSS and BSS. These functions are not only high above the data plane, their scope of operation is network-wide, which means that there’s probably no specific single location to be close to at all.

The wide scope of operation is important in itself. Networks are not homogeneous, nor are the impacts of a problem. If one of the goals of cloud-native design is scalability, then we could expect a greater need for cloud-native in areas where scaling is more likely to be needed, which would likely correspond to places with higher device, trunk, and customer density. If high availability is another goal, then we could expect it to be most needed where faults are more likely, which could be places with power and environmental issues.

What emerges from this little exercise is a representation of cloud-native value shaped like a kind of inverted jello cone. The tip of the cone represents the lower-layer functions, so the cone is sitting tip-downward on the network. The base of the cone, now at the top, represents high-level operational functions. The jello property means that we can push on the cone and shift the top a fair amount, representing flexibility in hosting, but the bottom piece is anchored and so moves very little.

In the data plane, at the bottom of our cone, we are accustomed to “nodes” being “devices” in a singular sense, and that’s not unreasonable given the fact that there’s little flexibility in where we place them relative to the trunk interfaces they support. Physical media demands a physical interface, and if we were to split a router into cooperative components, those components would have to be fairly close to each other in order to avoid introducing a lot of latency. Thus, a chassis router model is a smart one, creating a kind of large virtual router from the composite behavior of elements. The control plane of this router could be further separated to support hosting it on a nearby pool of elements, but it’s not likely that anyone would attempt to justify a control-plane resource pool; a bit of redundancy would serve.
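The “fairly close” constraint is just propagation physics. As a quick back-of-the-envelope sketch (light in fiber covers roughly 200 km per millisecond, which works out to about 5 microseconds per kilometer):

def fiber_delay_us(distance_km):
    # One-way propagation delay in fiber, roughly 5 microseconds per km.
    return distance_km * 5.0

print(fiber_delay_us(0.1))    # components in one facility: ~0.5 microseconds
print(fiber_delay_us(100.0))  # components a metro apart: ~500 microseconds per hop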

If we step up a titch from the data plane, we’d find a kind of fork in the road. One fork leads us to IP control-plane functionality, notably adaptive or centralized route control. The other leads us to service control planes like those of 5G. In both these cases, we can assume that controlling latency is much less critical, and so the functions of the control plane could be more distributed, which means that they could likely justify a resource pool strategy. The question would be where the pool was located, and that relates mostly to the scope-of-operations issue.

Most 5G functionality is likely to reside at the metro level, where 5G RAN (New Radio, or NR) and Core are joined. That means that any pooling of resources would have to be a form of edge computing, and one thing we’d need to consider is the relationship between “the edge” as a kind of latency-specialized cloud and the O-RAN O-Cloud, which is controlled by the RICs. Do we need, in edge computing, an architecture that makes that relationship explicit, or do we simply share resources in some mediated way? In any event, while this functionality could be implemented in cloud-native form, we’d need to consider what specific cloud model we were being native to.

IP control-plane hosting, the route management part that’s not embedded with the data plane, is best considered by looking at the centralized control plane of SDN (OpenFlow and the ONF model). Centralized route control’s latency requirements are minimal because the alternative, adaptive topology convergence, itself takes considerable time. It’s likely that we could host this in the cloud, and that it would be suitable for cloud-native development.
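As a sketch of why that latency tolerance exists: centralized route control is essentially periodic recomputation over a topology snapshot, with the results pushed out as forwarding entries, so per-packet forwarding never waits on it. The topology and link costs below are invented for illustration.

import heapq

def shortest_paths(topology, source):
    # Dijkstra over a topology snapshot: {node: {neighbor: link_cost}}
    dist, prev, heap = {source: 0}, {}, [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for nbr, cost in topology.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    return dist, prev

topo = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}, "D": {}}
print(shortest_paths(topo, "A")[0])   # {'A': 0, 'B': 1, 'C': 2, 'D': 3}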

It’s the management/operations stuff, the things at the top of our inverted cone, that has the greatest flexibility in terms of hosting location, because it’s typically linked to human interactions and humans aren’t particularly real-time. NMS, OSS, and BSS systems could be viewed as IoT-like, event-driven processes at the NMS level, then gradually becoming more transactional as we move to OSS and BSS. It would surely be possible to implement an NMS in cloud-native form, but the OSS/BSS piece is more complicated.

Transactional applications have limited scalability because they require database activity to complete. If we have a half-dozen instances of a database update component, we’ll have to exercise some distributed database update/access discipline to keep our repository from getting corrupted and to ensure users always get correct data on a read. This limit to the scalability benefit could mean that OSS/BSS justifies cloud-native at the front end of the applications, but tends toward traditional structures that serialize processing as we move to the database side.
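A minimal sketch of that split, with invented names: the front ends are stateless and can be replicated freely (the cloud-native part), while updates funnel through a single serialized writer so the repository stays consistent (the traditional part).

import queue
import threading

update_queue = queue.Queue()
repository = {}

def front_end(order_id, change):
    # Stateless; any number of replicas can run in parallel.
    update_queue.put((order_id, change))

def writer():
    # Single consumer applies updates in order: one writer, no race on the repository.
    while True:
        item = update_queue.get()
        if item is None:
            break
        order_id, change = item
        repository[order_id] = change

t = threading.Thread(target=writer)
t.start()
for i in range(3):
    front_end("order-%d" % i, {"status": "updated"})
update_queue.put(None)
t.join()
print(repository)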

The notion that network software must be “cloud-native”, or even the claim that making it so is a good thing, is an oversimplification. We surely have opportunities for “microservice” application models, including orchestration and service meshing, but the specific way that this would be done is likely to change as we move from data-plane handling upward to OSS/BSS. It would be smart to keep that in mind when you consider cloud-native claims for telecom/network software implementations.

The Dynamics of Two Tier Ones Show Wireline Directions

It’s always interesting, and even useful, to look at how AT&T and Verizon are doing in the broadband/wireline space. Verizon has led in new home broadband technologies with its early Fios push and now with fixed wireless (FWA), and AT&T has been much more aggressive in pursuing a position as a content provider. It’s common to pick one issue or the other and use it to justify favoring one operator or the other, but I think the truth is that both operators need to face both issues, and there are barriers to that happy state in play for them both. You can see some of these differences, and issues, in their quarterly reports.

To set the stage, AT&T and Verizon have very different wireline territories, a residue of the breakup of the Bell System into the Regional Bell Operating Companies. AT&T also merged with/acquired BellSouth, which added more (and different) geography, while Verizon didn’t add a lot of geography. That’s likely because Verizon’s territory is much more economically dense, which means that wireline infrastructure is more likely to be profitable for them. The two companies have followed different paths because they have this basic difference in what I’ve been calling “demand density”, which controls the ROI on access infrastructure. Verizon’s is good, but AT&T’s is much lower.

AT&T talks constantly about ways they’re saving on capex and opex, obviously because they need to boost their infrastructure ROI. They’ve been the most aggressive of all the Tier One operators in adopting open-model technology, changing their network, using smaller vendors, you name it. That’s not going to stop, and in fact I think it’s likely to ramp up over the next two years because of Wall Street concerns about their ability to sustain their dividend.

Cost management, as I’ve said in the past, is essentially a transitional strategy. Its benefits shrink over time; you can’t keep relying on it because it’s impossible to lower costs indefinitely. At some point you have to get top-line revenue growth, and AT&T has recognized that with its various media/content moves, the latest of which is the WarnerMedia move. Unlike AT&T’s cost management strategies, though, its media deals haven’t paid off so far (which of course is what’s behind the DirecTV and WarnerMedia moves).

Verizon, as I’ve noted, has the advantage of a geography whose demand density matches that of traditionally broadband-rich countries. With a much higher potential ROI on infrastructure investment, they’ve had little pressure for aggressive cost management, and similarly little pressure to acquire content companies. Their demand density has not only contributed to their aggressive Fios fiber plans, but also to their taking an early and strong position with 5G mm-wave (5G/FTTN) technology.

Generally, high demand density not only means dense urban areas, but wide-ranging and rich suburbs. That makes it much easier to deploy mm-wave technology where you can’t quite justify fiber. In effect, the demand density gradient from most dense (urban) to least (rural) is less pronounced for Verizon. They also have a competitive consideration; the biggest cable company (Comcast) is a major player in their region and CATV has a favorable pass cost, much lower than fiber. A mm-wave position is a way of keeping the Comcast wolf from the door.

Demand density also helps in the mobile space. Operators with wireline territories tend to do much better with mobile services within that territory, even if they offer broader mobile geography. Verizon always does well in national surveys of mobile broadband quality, and their success with 5G mobile service raises the chances that they’ll eventually offer 5G broadband as a wireline alternative in rural areas where mm-wave isn’t effective.

Given profitable wireline broadband, there’s less pressure on Verizon to embrace its own streaming. They do have Fios TV in linear form, and they offer a streaming box, but it supports other streaming services. Verizon sold off Verizon Media, which was not “content” in the sense of AT&T’s deals, and that suggests strongly that they’re not in the market for video/content assets. I don’t think that’s going to change, because they can make a go of their wireless and wireline businesses.

Right now, AT&T and Verizon are perhaps more competing on strategy than on sales, given that for network operators, the planning horizon is very long. Verizon is a network player, and AT&T is more and more like Comcast, a broadband-and-media player…or they’re trying to be. Not only is there a strategy difference today, there may well be a bigger one emerging…the edge.

Edge computing is really mostly about metro deployment of compute assets. Yes, there are stories about putting compute resources way out toward the real edge, but that’s mostly associated with function hosting for 5G. If you’re really going to sell edge computing on any scale, you need a population of prospective buyers who are packed into an area compact enough that latency can be controlled within it.

Think for a moment about the economic realities of edge computing. An investment in “telco cloud” by an operator is very much like an investment in wireline broadband infrastructure. You plop down a resource and it has, in its spot, a market radius it can serve. The more prospective revenue lives within that market radius, the more profitable any investment will be. More profit means lower risk and a larger bet you can make. If an edge model emerges, it will be far easier for Verizon to realize it because they have extraordinarily dense metro zones spread throughout their wireline footprint.
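Here’s a rough sketch of that arithmetic; every number below is hypothetical, and the only point is how strongly density within the market radius drives the return on a single edge site.

import math

def edge_site_return(pop_per_km2, radius_km, annual_revenue_per_person, site_cost):
    # Revenue reachable within the latency-limited market radius versus site cost.
    reachable = math.pi * radius_km ** 2 * pop_per_km2
    return reachable * annual_revenue_per_person / site_cost

# Dense metro vs a smaller, spread-out city (illustrative values only):
print(edge_site_return(4000, 20, 3.0, 10_000_000))   # ~1.5
print(edge_site_return(400, 20, 3.0, 10_000_000))    # ~0.15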

For AT&T, edge computing can offer comparable returns to Verizon’s only in some major metro areas; their population is much more spread out. They have a lot of smaller cities that are going to present a challenge in terms of profitable edge deployment. I think AT&T may see public cloud partnerships in 5G hosting as less of a long-term risk because they can’t yet see a long-term option for deploying their own cloud. Why not grab some benefits from a partnership when no roll-your-own is on the horizon?

This presents a challenge and an opportunity for vendors. What would a profitable metro model look like? How could you evolve to it from your current position in both 5G and metro networking? Would it scale to smaller cities, and could you network metros together, both to improve the breadth of latency-sensitive services and to better serve communities whose “edge” doesn’t touch enough dollars to permit deployment of a pool of hosting with economy of scale?

The cloud, and the network, are evolving. The players in both spaces are demonstrating that they understand that, and they’re also struggling to understand how to deal with it. The way this critical technology fusion works out will decide the future of a lot of services and a lot of vendors.