How Smart Chips Are Transforming Both Computing and Networks

I don’t think there’s any disagreement that network devices need to be “smart”, meaning that their functionality is created through the use of easily modified software rather than rigid, hard-wired logic. There is a growing question as to just how “smartness” is best achieved. Some of the debate has been created by new technology options, and the rest by a growing recognition of the way new missions impact the distribution of functionality. The combination is impacting the evolution of both network equipment and computing, including cloud services.

To most of us, a computer is a device that performs a generalized personal or business mission, but more properly the term for this would be a “general-purpose computer”. We’ve actually used computing devices for very specific and singular missions for many decades; they sent people into space and to the moon, for example, and they run most new vehicles and all large aircraft and ships. Computers are (usually) made up of three elements—a central processing unit (CPU), memory, and persistent storage for data.

Network devices like routers were first created as software applications running on a general-purpose computer (a “minicomputer” in the terms of the ‘60s and ‘70s). Higher performance requirements led to the use of specialized hardware technology to manage the data-handling tasks, but network devices have from the first retained a software-centric view of how features were added and changed. All the big router vendors have their own router operating system and software.

When you attach computers to networks, you need to talk network to the thing you’re attaching to, which means that network functionality has to be incorporated into the computer. This is usually done by adding a driver software element that talks to the network interface and presents a network API to the operating system and middleware. Even early on, there were examples of network interface cards for computers having onboard intelligence, meaning that some of the driver was hosted on the adapter instead of on the computer.
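Just to make that division of labor concrete (the class and function names here are mine, purely illustrative, not any real driver API), a driver hides the device behind one generic network call, and a “smart” adapter simply moves some of that work onto the card:

```python
# Illustrative sketch only: a "driver" hides device details behind a generic
# API, and a smart NIC moves some of that work (here, checksumming) onto the card.

import zlib


class DumbNic:
    """Models a basic adapter: it just puts bytes on the wire."""
    def transmit(self, frame: bytes) -> None:
        print(f"wire <- {len(frame)} bytes")


class SmartNic(DumbNic):
    """Models an adapter with onboard intelligence: it checksums frames itself."""
    def transmit_with_checksum(self, payload: bytes) -> None:
        checksum = zlib.crc32(payload).to_bytes(4, "big")   # done on the card
        self.transmit(payload + checksum)


class NetworkDriver:
    """The driver presents one network API to the OS, whatever the hardware is."""
    def __init__(self, nic: DumbNic):
        self.nic = nic

    def send(self, payload: bytes) -> None:
        if isinstance(self.nic, SmartNic):
            # Offload: the adapter computes the checksum.
            self.nic.transmit_with_checksum(payload)
        else:
            # No offload: the host CPU (the driver) does the work.
            checksum = zlib.crc32(payload).to_bytes(4, "big")
            self.nic.transmit(payload + checksum)


# The application and OS see the same send() call either way.
NetworkDriver(DumbNic()).send(b"hello")
NetworkDriver(SmartNic()).send(b"hello")
```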

The spread of software hasn’t halted the spread of hardware. Specialized chips for networking have existed for decades, and today switching and interface chips are the hardware soul of white-box devices. In the computer space, specialized chips for disk I/O and for graphics processing are almost universal. It’s not that you can’t do network connections, I/O, or graphics without special chips, but that you can do them a lot better with those chips.

So why are we now hearing so much about things like smart NICs, GPUs, IPUs, DPUs, and so forth? Isn’t what we’re seeing now just a continuation of a revolution that happened while many of today’s systems designers were infants, or before? In part it is, but there are some new forces at work today.

One obvious force is miniaturization. A smartphone of today has way more computing power, memory, and even storage than a whole data center had in the 1960s. While the phone does computing, graphics, and network interfacing, and while each of these functions is chipified individually, there’s significant pressure to reduce space and power requirements by combining things. Google’s new Pixel 6 has a custom Google Tensor chip that replaces traditional CPUs and incorporates CPU, GPU, security, AI processor, and image signal processor functions. IoT devices require the same level of miniaturization, to conserve space and of course minimize power usage.

Another force is a radical revision in what “user interface” means. By the mid-1980s, Intel and Microsoft both told me, over two-thirds of all incremental microprocessor power used in personal computers was going to the graphical user interface. That’s still true, but what’s changed is that we’re now requiring voice recognition, image recognition, natural language processing, inference processing, AI and ML, and all those other things. We expect computing systems to do more, to be almost human in the way they interact with us. All that has to be accomplished fast enough to be useful conversationally and in the real world, and at an acceptable cost in dollars, power, and space.

Our next new force is mission dissection, and this force is embodied by what’s going on in cloud computing. The majority of enterprise cloud development is focused on building a new presentation framework for corporate applications that are still running in the usual way, usually in the data center, and sometimes on software/hardware platforms older than the operators that run them. The old notion of an “application” has split into front-end and back-end portions, and the front-end piece is a long way from general-purpose computing while the back-end piece is a long way from a GUI. In IoT, we’re seeing applications broken down by the latency sensitivity of their functions, in much the same way as O-RAN breaks out “non- and near-real-time”.
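A minimal sketch of that split (the function names and data are invented for illustration) might look like this: the cloud front end handles validation and presentation, and hands the real work to a back end that hasn’t changed at all.

```python
# Hypothetical sketch of the front-end/back-end split: the cloud-hosted front
# end worries only about presentation, while the business logic stays in the
# data center behind an unchanged interface.

def call_back_end(query: dict) -> dict:
    """Stand-in for the legacy data-center application; in reality this would
    be an HTTP or message-queue call to a system that hasn't changed."""
    return {"orders": [{"id": 1, "customer": query["customer"], "status": "shipped"}]}


def front_end_handler(request: dict) -> dict:
    """Cloud front end: validate, delegate, and format for the user's device."""
    if "customer_id" not in request:                      # presentation-side validation
        return {"status": 400, "body": "customer_id is required"}

    backend_reply = call_back_end({"customer": request["customer_id"]})

    # Shape the reply for whatever device is asking: web, mobile, voice...
    return {"status": 200, "body": {"orders": backend_reply.get("orders", [])}}


print(front_end_handler({"customer_id": "C-42"}))
```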

The final new force is platform specialization encouraging application generalization. We separated out graphics, for example, from the mission of a general-purpose CPU chip because graphics involved specialized processing. What we often overlook is that it’s still processing that’s done by software. GPUs are programmable, so they have threads and instructions, and all the other stuff that CPUs have. They just have stuff designed for a specific mission, right? Yes, but an instruction set designed for a specific mission could end up being good for other missions, not just the one that drove its adoption. We’re seeing new missions emerging that take advantage of new platforms, missions that weren’t considered when those platforms were first designed.
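The easiest way to see that generality is with data-parallel code. In the sketch below, plain NumPy stands in for a GPU purely for illustration; the same “apply one operation across a huge array” pattern serves a graphics-style mission and a completely unrelated one.

```python
# Illustration of instruction-set generality: the same data-parallel pattern
# (one operation applied across a large array, the thing GPUs are built for)
# serves a graphics mission and a completely different one. NumPy stands in
# for the GPU here purely to keep the example runnable.

import numpy as np

# Graphics-style mission: brighten a million "pixels" at once.
pixels = np.random.randint(0, 256, size=1_000_000, dtype=np.int32)
brightened = np.clip(pixels + 40, 0, 255)

# Non-graphics mission: score a million sensor readings against a threshold,
# the kind of bulk filtering or inference an IoT or AI workload needs.
readings = np.random.random(1_000_000)
anomalies = np.count_nonzero(readings > 0.999)

print(f"brightened {brightened.size} pixels, flagged {anomalies} anomalies")
```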

What do these forces mean in terms of the future of networking and computing? Here are my views.

In the cloud, I think that what we’re seeing first and foremost is a mission shift driving a technology response. The cloud’s function in business computing (and even in social media) is much more user-interface-centric than general-purpose computing. GPUs do a great job for many such applications, as do the RISC/ARM chips. As we get more into the broader definition of “user interface” to include almost-human conversational interaction, we should expect that to drive a further evolution toward GPU/RISC and even to custom AI chips.

The edge is probably where this will be most obvious. Edge computing is all about real-time event handling, which again is not a general-purpose computing application. Many of the industrial IoT controllers are already based on a non-CPU architecture, and I think the edge is likely to go that way. It may also, since it’s a greenfield deployment, shift quickly to the GPU/RISC model. As it does, I think it will drive local IoT toward more of a system-on-chip (SoC) model by unloading some general functionality. That will make IoT elements cheaper and easier to deploy.
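One way to picture that event-handling role (the hosting points, latencies, and events below are all invented for illustration) is a dispatcher that pushes each event as far from the edge as its control-loop latency budget allows:

```python
# Hypothetical sketch of edge event handling: each event carries a latency
# budget (the length of its control loop), and the dispatcher hosts it as far
# away, and as cheaply, as that budget allows. Names and numbers are invented.

HOSTING_POINTS = [              # ordered farthest/cheapest first
    {"name": "regional cloud",         "round_trip_ms": 80},
    {"name": "metro edge site",        "round_trip_ms": 15},
    {"name": "on-premises controller", "round_trip_ms": 2},
]


def dispatch(event_name: str, latency_budget_ms: int) -> str:
    """Place the event at the farthest hosting point that still meets its budget."""
    for point in HOSTING_POINTS:
        if point["round_trip_ms"] <= latency_budget_ms:
            return f"{event_name} -> {point['name']}"
    return f"{event_name} -> cannot meet budget"


print(dispatch("robot arm stop",         latency_budget_ms=5))     # stays local
print(dispatch("traffic light retiming", latency_budget_ms=50))    # metro edge
print(dispatch("daily usage report",     latency_budget_ms=5000))  # regional cloud
```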

At the network level, I think we’re going to see things like 5G and higher-layer services create a sharp division of functionality into layers. We’ll have, to make up some names, “data plane”, “control plane”, “service plane” (the 5G control plane would fall into this, as would things like CDN cache redirection and IP address assignment and decoding), and “application plane”. This will encourage a network hardware model that’s very switch-chip-centric at the bottom and very cloud-centric (particularly edge-cloud) at the top. I think it will also encourage the expansion of the disaggregated cluster-of-white-boxes (like DriveNets) model of network devices, and that even edge/cloud infrastructure will be increasingly made up of a hierarchy of devices, clusters, and hosting points that are all a form of resource pool.
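To make those made-up plane names a little more concrete (the mapping is mine, not any standard), here’s a rough sketch of which functions land in which plane and the hardware each plane tends to favor:

```python
# A rough, non-standard mapping of the "planes" named above to example
# functions and to the hardware each plane tends to favor. Purely illustrative.

PLANES = {
    "data plane":        {"examples": ["packet forwarding", "traffic management"],
                          "favored hardware": "switch chips in white boxes"},
    "control plane":     {"examples": ["routing protocols", "topology updates"],
                          "favored hardware": "embedded CPUs / white-box clusters"},
    "service plane":     {"examples": ["5G control plane", "CDN cache redirection",
                                       "IP address assignment"],
                          "favored hardware": "edge-cloud hosting"},
    "application plane": {"examples": ["the services users actually buy"],
                          "favored hardware": "cloud and edge-cloud resource pools"},
}

for plane, info in PLANES.items():
    print(f"{plane:18} -> {info['favored hardware']}")
```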

What’s needed to make all this work is a new, broader notion of what virtualization and abstraction really mean, and how we need to deal with them. We know, for example, that a pool of resources has to present specific properties in order to be suitable for a given mission. If we have a hundred servers in New York and another hundred in Tokyo, can we say they’re a single pool? It depends on whether the selection of a server without regard for location alters the properties of the resource we map to our mission to the point where the mission fails. If we have a specific latency requirement, for example, it’s unlikely we could treat them as a single pool. We also know that edge computing will have to host things (like the now-popular metaverse) that will require low-latency coordination across great distances. We know that IoT is “real-time”, but also that the acceptable length of a control loop can vary depending on what we’re controlling. We know 5G has “non-real-time” and “near-real-time”, but just how “non” and how “near” those are isn’t explicit. All of this has to be dealt with if we’re to make the future we achieve resemble the future we’re reading about today.
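As a toy version of the New York/Tokyo question (all the latency numbers are invented), a collection of resources is only a pool for a given mission if its members can still meet that mission’s properties:

```python
# Toy version of the "is it really one pool?" question: a set of resources is
# a usable pool for a mission only if its members can meet the mission's
# properties, latency in this case. All latencies are invented for illustration.

SERVERS = (
    [{"site": "New York", "latency_to_user_ms": 10}] * 100 +
    [{"site": "Tokyo",    "latency_to_user_ms": 150}] * 100
)


def usable_pool(servers, max_latency_ms):
    """Return only the servers whose properties satisfy the mission."""
    return [s for s in servers if s["latency_to_user_ms"] <= max_latency_ms]


# A latency-tolerant mission can treat all 200 servers as one pool...
print(len(usable_pool(SERVERS, max_latency_ms=500)))   # 200

# ...but a latency-sensitive one effectively has a pool half that size.
print(len(usable_pool(SERVERS, max_latency_ms=50)))    # 100
```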

We’re remaking our notion of “networking” and “computing” by remaking the nature of the hardware/software symbiosis that’s creating both. This trend may interact with attempts to truly implement a “metaverse” to create a completely new model, one that distributes intelligence differently and more broadly, and one that magnifies the role of the network in connecting what’s been distributed. Remember Sun Microsystems’ old saw that “the network is the computer”? They may have been way ahead of their time.