Rethinking the Very Nature of Infrastructure

It probably seems a silly question, but what is a network these days? What is a cloud, a service? We’re seeing a series of technology shifts and business changes that are blurring traditional boundaries. At the same time, we seem unwilling to look for a new playbook to describe what’s happened, or what’s happening. That puts a lot of capital investment and revenue opportunity at risk, so today we’ll try to sort some things out.

Starting at the top is my favorite approach, and the top of everything now is the notion of a “service”. A service is a useful, valuable, and therefore billable capability that’s delivered in the form it’s expected to be consumed, not as a toolkit to build things that can then produce it. In computing, as-a-service technology has been applied to everything from what’s effectively hosting/server-as-a-service (IaaS) to application services (SaaS). In networking, services have focused on connectivity, but the concept of the connection has moved up from the physical level (Layer 1 of the OSI model) to Layer 2 (carrier Ethernet, VLANs), Layer 3 (IP services, including VPNs, SD-WAN, and the Internet), and in some cases even higher.

When you rise above connectivity at Layer 1, you necessarily involve functions like packet processing, functions that could be performed by a purpose-built device or by software hosted on a server. Do enough of that hosting and you have a resource pool and a cloud, and this is where computing and networking have started to merge.

Actually, they’ve already merged in a sense. IP networking demands a number of hosted functions to be workable, including address assignment (DHCP), name resolution (DNS), and content delivery networks (CDNs). As we add features to improve security, manageability, and discovery, we create opportunities to incorporate these things into the “service” of IP connectivity. As we enhance mobile communications, we’re explicitly committing more “network” features to hosting, and we’re spreading hosting resources to do that.
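To make that merger concrete, here’s a minimal sketch (Python, my own illustration rather than anything from a standard or product) showing that even the simplest “connection” already depends on a hosted service function, the DNS resolver:

```python
# Minimal illustration: even a basic connection leans on a hosted function (DNS).
import socket

def resolve(hostname: str) -> list[str]:
    """Return the IPv4 addresses a hosted DNS resolver supplies for a hostname."""
    results = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({sockaddr[0] for *_rest, sockaddr in results})

if __name__ == "__main__":
    # The answer comes from a resolver hosted somewhere in or near the network,
    # exactly the kind of function that's quietly part of the "service" already.
    print(resolve("example.com"))
```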

The network’s potential use of function hosting is important in general, but particularly important as it relates to specific services. Hosting is attractive to the extent that it’s profitable, meaning that you can sell it or it reduces your costs. Both of these attributes are difficult to achieve without some economy of scale, a resource pool. The best place to establish an efficient resource pool is in some central place where you can stick a massive data center, but that approach puts the resources too far from the user to support latency-sensitive services. Put the resource pool where it serves users best, right at the access edge, and you don’t have enough users to justify a pool with reasonable economy of scale.
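To put rough numbers on that tradeoff (a back-of-the-envelope sketch of my own; the distances below are assumptions, not measurements), fiber propagation alone adds roughly 1 ms of round-trip delay per 100 km of path, before any queuing or processing:

```python
# Back-of-the-envelope round-trip propagation delay for different hosting depths.
# Assumes roughly 200,000 km/s signal speed in fiber and ignores queuing, processing,
# and routing detours, so real-world delays will be higher.

FIBER_KM_PER_SECOND = 200_000

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds for a one-way fiber distance."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000

# Hypothetical hosting depths, purely illustrative.
for label, km in [("access edge", 10), ("metro", 50), ("regional", 500), ("national core", 2000)]:
    print(f"{label:>13}: ~{round_trip_ms(km):.1f} ms round trip before any processing")
```

Even with those optimistic assumptions, the deep national pool burns around 20 ms before a single packet is processed, while the metro pool stays around half a millisecond.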

Deep hosting is problematic for another reason, the challenge of personalization. At the edge of the network, an access connection serves a user. In the core, that user’s traffic is part of what’s likely a multi-gigabit flow that you simply cannot afford to dissect and process individually. Go too deep with function hosting and per-user stuff is impractical. Go too shallow and you’ve specialized your hosting to a single user.

Metro is the sweet spot for function hosting. It’s close enough to the edge that it can support most latency-sensitive applications, and it’s also close enough to allow for personalization of services because you can handle individual customers, even individual sessions where necessary. As it happens, metro is also a great place to host generalized edge computing services, and that’s creating the current uncertain dynamic between network operators and cloud providers.

The problem with edge computing as a general service is that we shouldn’t be calling it edge computing at all. Using what’s essentially a geographic or topological term implies that the only differentiator for edge computing is where it is. That implies it could be either the cloud pushing closer or enterprise compute resources pushing outward, and either would suggest that it’s just another kind of public cloud computing. As I’ve noted before, an edge market implies that the edge offers something different from both cloud and premises. In the case of the cloud, it offers lower latency. In the case of the premises, it offers the scalability, resilience, and expense-versus-capital-cost benefits of the cloud to applications that were kept on-premises because of low-latency requirements. This is why I’ve said that latency is the driver of the edge opportunity.

Latency is also the source of the challenge, because it’s obvious that we don’t have any significant edge computing service deployment today, and yet we have applications that are latency-sensitive. There’s a boatload of IoT applications that rely today on premises hosting of application components to shorten the control loop. That means that “private” edge is a viable strategy. Could there be other IoT applications, or latency-sensitive applications other than IoT, that could exploit edge computing? Sure, but what justifies it? What induces somebody to deploy edge resources in the hope of pulling those applications out of the woodwork?
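To make the control-loop point concrete (a hypothetical sketch of my own; the class names and millisecond figures are assumptions, not data from any deployment), the hosting decision is essentially a latency-budget check:

```python
# Hypothetical latency-budget check for an IoT control loop.
# All numbers are illustrative assumptions, not measured values.

from dataclasses import dataclass

@dataclass
class HostingOption:
    name: str
    network_rtt_ms: float   # round trip between sensor/actuator and the host
    processing_ms: float    # time to evaluate the event and decide on an action

    def loop_ms(self) -> float:
        return self.network_rtt_ms + self.processing_ms

def viable_options(options: list[HostingOption], budget_ms: float) -> list[str]:
    """Return the hosting options whose control loop fits within the latency budget."""
    return [o.name for o in options if o.loop_ms() <= budget_ms]

options = [
    HostingOption("on-premises controller", network_rtt_ms=1, processing_ms=5),
    HostingOption("metro edge pool", network_rtt_ms=5, processing_ms=5),
    HostingOption("regional cloud", network_rtt_ms=40, processing_ms=5),
]

# A 20 ms budget (assumed) admits premises and metro hosting but not the deeper cloud.
print(viable_options(options, budget_ms=20))
```

Under those assumed numbers, the premises controller and a metro pool both fit the budget, which is why premises hosting works today and why metro is the natural place for a shared alternative.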

That’s where the network stuff comes in. If network service components, from DNS/DHCP to CDNs, from 5G hosting to NFV, from live streaming channel guides to advertising personalization, were to be hosted in the metro area, they’d justify a local resource pool that could then be applied to general edge computing applications.

The market flap about whether network operators are selling their (edge) souls to the cloud providers has its legitimate roots in this issue. If operators elect to either partner with cloud providers in a shared-real-estate deal, or simply outsource their network function hosting to the cloud, they don’t stimulate their own edge solutions. However, it’s interesting to note that while this debate on who owns the future edge is going on, those very cloud providers are pushing a premises-hosted IoT edge strategy to enterprises.

Cloud providers do not want network operators entering the edge market, for obvious reasons. Do the cloud providers see the impact of 5G and other network service hosting requirements on edge computing as being minimal, at least in the near term? That would explain why they’re pushing to bypass the notion of edge-as-a-service in favor of enterprise do-edge-yourself. They might also believe that operators’ IoT interests could tip the scales and induce operators to deploy edge services.

This issue could cloud the next issue, which from a technology perspective is more complicated. If metro is the hosting point of the future, then what exactly makes it up and what vendor provides the critical technology? There is no question that metro has to include hosting, because it’s hosted features/functions that are driving the focus to metro in the first place. There’s no question that it has to include networking, within the resource pools and connecting all the service/network elements. Obviously software will be required, too. What dominates the picture?

I think the answer is “the hosting”, because what’s changing metro is the hosting. There isn’t any metro without an edge-host resource pool, so metro is then built around data centers, and the first new network role is to provide connections within those data centers. The second is to provide connections between the data centers in the metro, and between the networked data centers and the access network and core.

If latency is critical, and event processing is what makes it critical, then the hosting could end up looking very different from traditional cloud computing, which is largely based on x64 processors. It’s likely that we would see more RISC and GPU processing, because those technologies are especially good with microservice-and-event architectures. It’s also likely that we’d elect to do packet processing on smart network interface cards (SmartNICs), because that would make execution of those generic tasks, including security, faster in both throughput and latency terms.

On the software side, things are even more complicated. We have three basic software models in play that might be adopted. The obvious one is the “cloud model”, meaning a replica of the public cloud’s web services and hosting features. Another is the Network Functions Virtualization (NFV) model promoted by the network operators through the ETSI ISG work that started in 2013. The third is the event-processing model that already exists in IoT, based on lightweight operating systems (real-time or embedded) and minimalist middleware. Each of these has pluses and minuses depending on the application.
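To give a sense of how spare that third model is compared with general cloud hosting, here’s a deliberately minimal event-handler loop (a sketch of my own in Python; a production version would more likely sit on a real-time OS with compiled code, and all the names here are invented):

```python
# Minimal event-processing loop: no web framework, no orchestration layer, just
# events in and handlers out. Illustrative only; names and structure are invented.

import asyncio
import time
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[None]]

class EventBroker:
    """Routes incoming events to registered handlers with as little machinery as possible."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    async def dispatch(self, event: dict) -> None:
        for handler in self._handlers.get(event["type"], []):
            await handler(event)

async def handle_door_sensor(event: dict) -> None:
    # In a real control loop this would drive an actuator; here we just report latency.
    print(f"door event handled {time.monotonic() - event['ts']:.6f}s after arrival")

async def main() -> None:
    broker = EventBroker()
    broker.on("door", handle_door_sensor)
    await broker.dispatch({"type": "door", "ts": time.monotonic(), "state": "open"})

asyncio.run(main())
```

The point isn’t the code itself but what’s missing from it; the cloud and NFV models would surround the same logic with far more platform machinery.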

The final issue is what I’ll call “edge middleware”, the tools/features needed to support applications that are not only run at “the edge” but that require edge coordination and synchronization over a wider area. This sort of kit is essential for the metaverse concept to work, but it’s also likely to be required for IoT where events are sourced over a wider geography. It’s this edge-middleware thing, and the possibility that it will drive a different software development and hosting model, that I think is behind the story that Qualcomm believes the metaverse could be the “next computing platform”.
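One narrow piece of what such middleware would have to do can be sketched in a few lines (a hypothetical illustration of mine, not a description of any real product): keeping shared state consistent across edge sites even though updates arrive at different times:

```python
# Hypothetical sketch of one edge-middleware job: reconciling shared state across
# edge sites with a last-writer-wins rule. Real coordination would also need clock
# discipline, conflict resolution, and delivery guarantees; this only shows the shape.

from dataclasses import dataclass, field

@dataclass
class StateEntry:
    value: object
    timestamp: float  # assumed to come from a synchronized clock source

@dataclass
class EdgeSite:
    name: str
    state: dict[str, StateEntry] = field(default_factory=dict)

    def local_update(self, key: str, value: object, timestamp: float) -> None:
        self.state[key] = StateEntry(value, timestamp)

    def merge_from(self, other: "EdgeSite") -> None:
        """Adopt any entry the other site has seen more recently."""
        for key, entry in other.state.items():
            mine = self.state.get(key)
            if mine is None or entry.timestamp > mine.timestamp:
                self.state[key] = entry

# Two metro sites see updates to the same avatar position a few milliseconds apart.
east, west = EdgeSite("metro-east"), EdgeSite("metro-west")
east.local_update("avatar-42/position", (10, 4), timestamp=100.000)
west.local_update("avatar-42/position", (11, 4), timestamp=100.020)

east.merge_from(west)
west.merge_from(east)
print(east.state["avatar-42/position"].value == west.state["avatar-42/position"].value)  # True
```

How quickly and fairly that reconciliation happens across a wide geography is the hard part, and it’s exactly what makes the metaverse case in the next paragraph so demanding.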

The metaverse angle is important, and not just because it’s generating a lot of media interest. The reality of edge computing and latency is that network function hosting isn’t likely to be a true real-time application. Much of the network signaling can tolerate latencies of a second or more. IoT is demanding in latency terms, but many of its requirements can be satisfied with an on-premises controller. Metaverse synchronization, on the other hand, is very demanding, because any lag in synchronizing behaviors within a metaverse impacts its realism, its credibility. That’s why Qualcomm’s comment isn’t silly.

These technology points impact the question of who formalizes the thing that network and cloud may be converging on. The more the technology of that converged future matches that of the cloud present, the more likely it is that public cloud providers will dominate. That dominance would be shaken most if a software model evolved that didn’t match the cloud, which would be the case if the “edge middleware” evolved early. That could happen if somebody got smart and anticipated what the tools would look like, but it might also emerge if the 5G activity promoted a kind of functional merger between the RAN Intelligent Controllers (RICs) of O-RAN and cloud orchestration.

Network vendors are most likely to be influential in that latter situation, if 5G and network features drive enough metro hosting to push a resource pool into place before separate edge applications like IoT have a chance. If a network vendor pushed their 5G and RIC position to the extreme and included the synchronization and coordination mission I mentioned earlier, that vendor might have a shot at controlling what happens at that up-top convergence point between network and cloud.