Where is the best place to host computing power? Answer: Where it’s needed. Where’s the most economical place? Answer: Probably not where it’s needed. The dilemma of the cloud, then, is how to balance optimal QoE against the business case. I’m going to propose that this dilemma changes the nature of the cloud, and that a better definition of an “edge” is a “local” processing resource. A Light Reading article quotes Dish as saying that for 5G, “the edge is everywhere”. Is that true, or is it truer that the cloud is subsuming the edge?
The relationship between humans and applications has been a constant process of increasing intimacy. We started off by keypunching in stuff we’d already done manually, evolved to online transaction processing, empowered ourselves with our own desktop computers, and now we carry a computer with us that’s more powerful than the ones that read those punched cards. How can you see this as anything other than the classic paso doble, partners circling ever closer? It shows that we’ve historically benefitted from having our compute intimately available.
The cloud, in this context, is then a bit counterintuitive. We collectivize computing, which seems to pull it away from us. Of course, now the cloud people want to be edge people too, putting computing back in our faces, but I digress. The point here is that economics favors pooled resources and performance favors dedicated, personalized resources.
We could probably do some flashy math that modeled the attractive force of economics and the attractive force of personalization to come up with a surface that represented a good compromise, but the problem is more complicated than that. The reason is that we’ve learned that most applications exist in layers, some of which offer significant benefits if they’re pushed out toward us, and others where economics will overwhelm such a movement.
IoT is a good place to see the effect of this. An IoT application that turns a sensor signal into a command to open a gate is one where having the application close to the event could save someone from hitting the gate. However, a mechanical switch could have done that more cheaply. In the real world, our application likely involves a complex set of interactions that might actually start well before the truck ever reaches the gate sensor. We might read an RFID tag from a truck as it turns into an access road, look up what’s supposed to be on it, and then route it to the correct gate, and onward to the correct off- or on-load point.
This application might sound similar to that first simple example, but it changes the economic/QoE tradeoff of compute placement. Unless the access road is a couple of paces long, we’d have plenty of time to send our RFID truck code almost anywhere in the world for a lookup. Since we’re guiding the truck, we can open the gates based on our guidance, and so much of the real-time nature of the application is gone.
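To make the layering concrete, here’s a minimal sketch in Python. Everything in it is hypothetical, including the function names and the latency figures; the point is only that the two handlers have wildly different latency budgets, so they can be hosted in wildly different places.

```python
# A sketch of the two layers in the truck-gate example. All names and
# timings are invented illustrations, not drawn from any real IoT system.

import time

def lookup_manifest(truck_id: str) -> dict:
    # Stand-in for a lookup that could run half a world away; a few
    # hundred milliseconds of round trip doesn't matter on an access road.
    time.sleep(0.2)
    return {"truck": truck_id, "gate": "G7", "dock": "D3"}

def open_gate(gate_id: str) -> None:
    # Stand-in for the event-to-actuator loop; the only piece with a hard
    # real-time constraint, and trivial enough for a mechanical switch.
    print(f"gate {gate_id} open")

def on_rfid_read(truck_id: str) -> str:
    """Deep layer: seconds of truck travel time to spare."""
    manifest = lookup_manifest(truck_id)
    return manifest["gate"]

def on_gate_sensor(gate_id: str) -> None:
    """Local layer: must stay close to the event."""
    open_gate(gate_id)

# The guidance step decouples the layers: because we routed the truck,
# the gate opens on our schedule, not the sensor's.
gate = on_rfid_read("RFID-1138")
on_gate_sensor(gate)
```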
Network services offer similar examples of layers in that economics-to-QoE tradeoff, and one place where that’s particularly true is in the handling of control-plane messages. Where is the best place to handle a control-plane message? The answer seems simple (the edge), but the control plane in IP is largely a hop function, not an end-to-end function. There are hops everywhere there’s a node, meaning at every router.
Let’s look at things another way. We call on a cloud-native implementation of a control-plane function. We go through service discovery and run it, and it happens that the instance we run was deployed twelve time zones away. What’s the chance that this sort of allocation is going to produce favorable network behavior, particularly when it’s multiplied by all the control-plane packets that might be seen in a given place?
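Here’s a toy illustration of the problem, with invented instance names and round-trip times. The “naive” pick is what location-blind service discovery amounts to; the alternative filters on a latency budget first.

```python
# Why location-blind discovery goes wrong: if instance selection ignores
# network distance, a control-plane call can land twelve time zones away.
# Instance names and RTT figures below are invented.

from dataclasses import dataclass

@dataclass
class Instance:
    name: str
    rtt_ms: float   # estimated round trip from the message source

def pick_naive(instances):
    # Typical discovery: any healthy instance will do.
    return instances[0]

def pick_latency_aware(instances, budget_ms: float):
    # Keep only instances that fit the budget, then take the closest.
    viable = [i for i in instances if i.rtt_ms <= budget_ms]
    return min(viable, key=lambda i: i.rtt_ms) if viable else None

fleet = [Instance("tokyo-1", 180.0), Instance("local-rack-1", 0.4)]
print(pick_naive(fleet).name)               # tokyo-1: "favorable"? No.
print(pick_latency_aware(fleet, 5.0).name)  # local-rack-1
```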
One solution is to create what could be called a “local cloud”, a cloud that contains a set of hosting resources logically linked to a given source of messages, like a set of data-plane switches. Grabbing a resource from this pool would provide some resource-pool benefits versus fixed allocation of resources to features, and it wouldn’t require a major shift in thinking about cloud hosting overall.
Broader cloud-think comes in where we either have to overflow out of our “local cloud” for resources, or where we have local-cloud missions that tie logically to deeper functionality. If the IP control plane is a “local cloud” mission, how local is the 5G control plane, and how local is the application set that might be built on 5G? Do we push these things out across those twelve time zones? Logically there would be another set of resources that might be less “local” but would sure as the dickens not be “distant”.
The cloud is hierarchical, or at least it should be. There is no “edge” in an absolute sense, because what’s “edge” to one application might be deep core to another. Instead, for a given application, there’s a defined topology that represents the way things that need a deeper resource (either because the shallow ones ran out or because the current task isn’t that latency-sensitive) would be connected with it.
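One way to picture that topology is as a chain of hosting tiers, each parent deeper and farther from the message source than the last. This is only a sketch with invented tier names and latencies, but it shows how the same tree serves different applications differently depending on their latency budgets.

```python
# Hypothetical hosting-tier chain, rooted where the messages originate.
# Tier names and latency numbers are invented for illustration.

from dataclasses import dataclass
from typing import Optional

@dataclass
class HostingTier:
    name: str
    latency_ms: float                       # from the message source
    parent: Optional["HostingTier"] = None  # the next, deeper tier

region = HostingTier("regional-dc", 30.0)
metro = HostingTier("metro-pool", 8.0, parent=region)
local = HostingTier("local-cloud", 1.0, parent=metro)

def tiers_within(budget_ms: float, leaf: HostingTier) -> list:
    """Walk from the shallow tier toward deeper ones, keeping those in budget."""
    t, out = leaf, []
    while t:
        if t.latency_ms <= budget_ms:
            out.append(t)
        t = t.parent
    return out

# An IP control-plane mission (~2 ms budget) stays local; a 5G application
# with 50 ms to spare can reach all the way to the regional data center.
print([t.name for t in tiers_within(2.0, local)])   # ['local-cloud']
print([t.name for t in tiers_within(50.0, local)])  # all three tiers
```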
This view would present some profound issues in cloud orchestration, because while we do have mechanisms for steering containers to specific nodes (hosting points), those mechanisms aren’t based on the location where the user is, or the nature of the “deep request” tree I’ve described. This issue, in fact, starts to make the cloud look more “serverless”, or at least “server-independent”.
How should something like this work? The answer is that when a request for processing is made, a request for something like control-packet handling, the request would be matched to a “tree” that’s rooted in the location from which the request originated (or to which the response is needed). The request would include a latency requirement, plus any special features that the processing might require. The orchestration process would parse the tree looking for something that fit. It should prefer instances of the needed process that are already loaded and ready, and it should account for how long it would take to instantiate the process if it weren’t loaded anywhere. All that would eventually either match a location, triggering a deployment, or create a fault.
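Here’s a sketch of that matching step, using the same invented tier idea as above: warm instances win because a cold start adds instantiation time to the cost, and a fault is raised when nothing in the tree fits the budget.

```python
# A sketch of the tree-matching placement step described above. The tier
# model, timings, and process names are all hypothetical.

from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    latency_ms: float     # from the request origin
    warm: set             # process images already loaded and ready here
    cold_start_ms: float  # time to instantiate an image here

def place(process: str, budget_ms: float, tree: list) -> Tier:
    # 'tree' is the path of tiers, shallowest to deepest, for this origin.
    best = None
    for tier in tree:
        cost = tier.latency_ms
        if process not in tier.warm:
            cost += tier.cold_start_ms  # placing here triggers a deployment
        if cost <= budget_ms and (best is None or cost < best[0]):
            best = (cost, tier)
    if best is None:
        raise RuntimeError(f"fault: no tier can host {process} in {budget_ms} ms")
    return best[1]

tree = [
    Tier("local-cloud", 1.0, warm=set(), cold_start_ms=40.0),
    Tier("metro-pool", 8.0, warm={"ctl-handler"}, cold_start_ms=40.0),
]
print(place("ctl-handler", 10.0, tree).name)  # metro-pool: warm beats cold
```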
The functionality I’m proposing here is perhaps somewhere between a service mesh and an orchestrator, but with a bit of new stuff thrown in. There might be a way to diddle through some of this using current tools in the orchestration package (Kubernetes has a bunch of features to help select nodes to match pods), but I don’t see a foolproof way of covering all the situations that could arise. Kubernetes has its affinities, taints, and tolerations, but those do a better job of getting a pod to a place where other related pods are located, and they’re limited to node awareness rather than supporting a general characteristic like “in my data center” or “in-rack”. It might also be possible to create multiple clusters, each representing a “local collection” of resources, and use federation policies to steer things, but I’m not sure that would work if the deployment was triggered within a specific cluster. I’d welcome a reference from anyone with more detailed experience in this area!
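For reference, this is roughly what stock node affinity buys you. The pod spec below (written as the Python dict you’d hand to the Kubernetes API) pins a pod to nodes carrying a locality label. The label key is hypothetical, and that’s exactly the limitation: Kubernetes matches labels someone has already applied to nodes; it has no native notion of “in-rack relative to this message source”.

```python
# A pod spec using standard Kubernetes nodeAffinity. The label key
# "locality.example.com/rack" is invented; it only works if an operator
# has labeled the nodes accordingly.

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "ctl-handler"},
    "spec": {
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [{
                            "key": "locality.example.com/rack",  # hypothetical label
                            "operator": "In",
                            "values": ["rack-17"],
                        }]
                    }]
                }
            }
        },
        "containers": [{"name": "handler", "image": "example/ctl-handler"}],
    },
}
```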
Another point is that if service mesh technology is used to map messages to processes, should that mapping consider the same issue of proximity or location? There may be process instances available in multiple locations, which is why load-balancing is a common feature of service meshes. The selection of the optimum currently available instance is as important as picking the optimum point to instantiate something that’s not currently hosted.
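Here’s a sketch of what locality-aware selection might look like on the mesh side, with invented localities and round-trip estimates; the idea is to weight the load-balancing choice by proximity rather than spreading requests uniformly.

```python
# The same proximity question, applied to instances that already exist
# rather than to placement. Localities and RTT figures are invented.

import random

def pick_instance(instances: list, caller_locality: str) -> dict:
    # Prefer same-locality endpoints; otherwise fall back to the full set,
    # weighted inversely by estimated distance instead of round-robin.
    near = [i for i in instances if i["locality"] == caller_locality]
    pool = near or instances
    weights = [1.0 / (1.0 + i["rtt_ms"]) for i in pool]
    return random.choices(pool, weights=weights, k=1)[0]

fleet = [
    {"name": "a", "locality": "metro-east", "rtt_ms": 0.5},
    {"name": "b", "locality": "region-west", "rtt_ms": 45.0},
]
print(pick_instance(fleet, "metro-east")["name"])  # 'a'
```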
Why is all this important? Because it’s kind of useless to be talking about reducing things like mobile-network latency when you might well introduce a lot more latency by making bad choices on where to host the processes. Event-driven applications of any sort are the ones most likely to have latency issues, and they’re also the ones that often have specific “locality” to them. We may need to look a bit deeper into this as we plot the evolution of serverless, container orchestration, and service mesh, especially in IoT and telecom.