A Placement Optimization Strategy for the Cloud and Metaverse

One of the biggest differences between today’s component- or microservice-based applications and the applications likely to drive the “metaverse of things” (MoT) that I’ve been blogging about is the level of deployment dynamism. Edge applications aren’t likely to be confined to a single metro area; a “metaverse” is borderless in real-world terms because its virtual world is expected to twin a distributed real-world system. Many IoT applications will have to deliver consistently low latency even when the distribution of sensor and control elements means that “the edge” is scattered over perhaps thousands of miles.
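To put “thousands of miles” in latency terms, here’s a quick back-of-the-envelope sketch, assuming the common rule of thumb of roughly 5 microseconds of fiber propagation delay per kilometer (real paths add routing, queuing, and processing delay on top of this):

```python
# Rule-of-thumb fiber propagation delay: ~5 microseconds per km.
FIBER_DELAY_US_PER_KM = 5.0

def round_trip_ms(distance_km: float) -> float:
    """Propagation delay out and back, in milliseconds."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0

for km in (10, 100, 1600):  # metro, regional, ~1,000 miles
    print(f"{km:>5} km -> ~{round_trip_ms(km):.1f} ms round trip")
```

Even before any compute time, a control loop stretched over 1,600 km has burned roughly 16 ms round trip, which is why where you place components relative to the devices matters so much.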

As things move around in the real world, or in the virtual world, the configuration of a given experience or service will change, and that’s likely to change where its elements are hosted and how they’re connected. We’ve faced agility issues in applications since the cloud era began, but that agility has been confined to responding to unusual conditions like component failures or load surges. Real-world systems like a metaverse face real-world movement and focus changes, and that drives a whole new level of demand for agile placement of elements.

Today, in IoT applications, many hybrid cloud applications, and probably all multi-cloud applications, we’re already running into challenges in how to optimize hosting. This, despite the fact that only scaling and failover introduce dynamism to the picture. It’s possible to configure orchestration tools like Kubernetes (for a cluster) or Anthos (for a kind of cluster-of-clusters) with affinities, taints, and tolerations to guide how we map software components to hosting points. Even for today’s less-dynamic applications, with fewer and less complex hosting decisions, this process is complicated. Add edge hosting and metaverse dynamism and I think we’re beyond where traditional operations practices can be expected to work.
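For readers who haven’t wrestled with these knobs, here’s a minimal sketch of the relevant fragment of a Kubernetes pod spec, written as its Python-dict equivalent; the “zone” label and the “dedicated=edge” taint are hypothetical placeholders of mine, not anything Arctos or Google prescribes:

```python
# Fragment of a Kubernetes pod spec (dict form) showing the knobs named
# above: node affinity attracts the pod to nodes with a given label,
# while a toleration lets it land on nodes tainted for edge workloads.
pod_spec = {
    "affinity": {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": "zone",              # hypothetical node label
                        "operator": "In",
                        "values": ["edge-metro-1"],
                    }]
                }]
            }
        }
    },
    "tolerations": [{
        "key": "dedicated",                         # hypothetical taint
        "operator": "Equal",
        "value": "edge",
        "effect": "NoSchedule",
    }],
}
```

Multiply this across dozens of components and hosting points and the appeal of generating such rules from higher-level intents, rather than writing them by hand, is obvious.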

We can see the emergence of these issues today, and as I noted in my blog on December 13th, we also have at least one potential answer. I became aware through LinkedIn of a company, Arctos Labs, that focuses on “edge cloud placement optimization”. They sent me a deck on their approach, and I set up an online demo of their software. I propose to use it as an example of the critical middle-layer functionality of edge software that I described in that prior blog. I also want to see whether the capabilities of the software could be applied to more traditional application challenges, to assess whether there’s a way of evolving into it without depending on widespread edge deployment.

Arctos says that edge computing is driving three concurrent shifts (which I’ll paraphrase): first, from monolithic models to microservices; second, from transactional processing toward events, streaming, and real-time AI; and third, from centralized to heterogeneous and distributed hosting. These shifts, which are creating the requirements for edge-computing models, drive latency sensitivity, workload changes that are difficult to predict, multifaceted relationships among potential hosting points for components, sensitivity to the geographic distribution of work and processing points, a need to optimize limited edge economies of scale, and a need to be mindful of energy efficiency.

According to the company (and I agree), “edge applications” will likely have some edge-service hosting, but are also likely to have public cloud and data center hosting as well. As I’ve noted in prior blogs, an IoT application has multiple “control loops”, meaning pathways that link events and responses. Each successive loop both triggers a response component and, in at least some cases, activates the next loop in the sequence. Devices connect to gateways, to local processing, to edge services, to cloud and data center, in some sequence, so we have “vertical” distribution to accompany any “horizontal” movement of edge elements. It’s trying to distribute components across this hosting continuum that creates the complexity.
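A toy model of that vertical chain might look like the sketch below; the tier names and latency budgets are illustrative numbers of my own, not Arctos’:

```python
from dataclasses import dataclass

@dataclass
class ControlLoop:
    tier: str          # where this loop's response component is hosted
    budget_ms: float   # tolerable event-to-response latency

# "Vertical" distribution: the tightest loops sit closest to the
# devices, and each loop can activate the next, looser one in sequence.
loops = [
    ControlLoop("device gateway", 2.0),
    ControlLoop("local processing", 10.0),
    ControlLoop("edge service", 25.0),
    ControlLoop("cloud/data center", 250.0),
]

for inner, outer in zip(loops, loops[1:]):
    print(f"{inner.tier} ({inner.budget_ms} ms) may activate "
          f"{outer.tier} ({outer.budget_ms} ms)")
```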

Arctos’ goal is to provide “topology-aware placement optimization based on declarative intents”. In practice, this means they convert behavioral constraints like tolerable latency, utilization, and cost into a decision on where to put something in this (as they phrase it) “fog-to-cloud continuum”. In their demo, they show three layers of hosting and allow manual input of things like hosting-point capacity utilization, latency tolerance of the control loops, and cost of resources. When you change something, whether it’s latency tolerance or hosting-point status, the software computes the optimum new configuration and asks whether it should be activated. The activation would be driven through lower-level platform/orchestration tools (in the demo, it was done through VM redeployment).
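As a minimal sketch of the idea, and emphatically not Arctos’ actual algorithm, intent-driven placement might reduce to something like: pick, for each component, the cheapest hosting point whose measured latency and utilization satisfy the declared intents.

```python
from dataclasses import dataclass

@dataclass
class HostingPoint:
    name: str
    latency_ms: float      # latency to the component's event source
    utilization: float     # current capacity utilization, 0..1
    cost_per_hour: float

@dataclass
class Intent:
    max_latency_ms: float
    max_utilization: float

def place(intent: Intent, hosts: list[HostingPoint]) -> HostingPoint | None:
    """Cheapest hosting point that satisfies the declarative intent."""
    feasible = [h for h in hosts
                if h.latency_ms <= intent.max_latency_ms
                and h.utilization <= intent.max_utilization]
    return min(feasible, key=lambda h: h.cost_per_hour, default=None)

hosts = [
    HostingPoint("on-prem edge", 3.0, 0.80, 0.90),
    HostingPoint("metro edge", 9.0, 0.40, 0.55),
    HostingPoint("cloud region", 40.0, 0.20, 0.25),
]
print(place(Intent(max_latency_ms=15.0, max_utilization=0.6), hosts))
# -> metro edge: the cloud region is cheaper but misses the latency
#    intent, and the on-prem edge is too heavily utilized
```

The real problem is much harder than this greedy pass suggests, because component placements interact through the latency of the paths between them, but the sketch captures the intent-to-decision conversion.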

I think the easiest way of looking at the software is as a “virtual placement assistant”. You set it up with the declarative intents that describe your application and your resources. You connect it to the management APIs that give it visibility into the infrastructure and applications, and to the control APIs that manage deployment and redeployment, and it then does its magic. In the demo, the setup was designed to assist an operations team, but I don’t see any reason why you couldn’t write the code needed to provide direct resource telemetry and control linkages, making the process either manual-approve as it is in the demo, or automatic as it would likely have to be in many MoT missions.
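The pattern could be wired up roughly like the sketch below, where get_telemetry, compute_placement, and apply_placement stand in for the management and control API linkages (all hypothetical names of mine, not Arctos’ interfaces):

```python
import time

def placement_loop(get_telemetry, compute_placement, apply_placement,
                   auto_approve: bool = False, interval_s: float = 30.0):
    """Poll telemetry, propose placements, apply on approval or automatically."""
    current = None
    while True:
        proposed = compute_placement(get_telemetry())
        if proposed != current:
            # Manual-approve mode mirrors the demo; automatic mode is
            # what many MoT missions would likely require.
            if auto_approve or input(f"Apply {proposed}? [y/N] ").lower() == "y":
                apply_placement(proposed)
                current = proposed
        time.sleep(interval_s)
```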

Let’s go back, though, to that vertical distributability. Even in its current demo form, the Arctos software could be a valuable tool for hybrid cloud, multi-cloud, and cloud-associated edge computing. By the latter, I mean the use of cloud-provider tools to extend the provider’s cloud to the premises, something like Amazon’s Greengrass. The software could allow an operator to deploy or redeploy something without having to figure out how to match what they want to all the various “attractor” and “repeller” parameters of something like Kubernetes.

This isn’t a standalone strategy. The software is designed as a component of an OEM package, so some other provider would have to incorporate it into their own larger-scale offering. That would involve creating a mechanism to define declarative intents and relate them to the characteristics of the various hosting options, and then mapping the Arctos software to the management telemetry and control APIs needed.

This suggests to me that, since my MoT “middle layer” has to include both the placement optimization and the ability to define abstract services, a player who decided to build that abstract-services sub-layer might be the logical partner for Arctos. There doesn’t seem to be anyone stepping up in that space, though, and MoT and edge services may be slow to develop, so Arctos’ near-term opportunity may lie in the hybrid-, multi-, and cloud-extended-premises area, which has immediate application.

Arctos is a model of the placement piece of my MoT middle layer, and of course there could be other implementations that do the same thing, or something similar; AI and simulation could both drive placement decisions, for example. Another point that comes out of my review of Arctos is that it would be helpful if there were a real “lower layer” in place, meaning that the hosting and network infrastructure exposed standard APIs for telemetry and management of network and hosting resources. That would allow multiple middle-layer elements to build to that API set and match all the infrastructure that exposed those APIs. I don’t think we’re likely to see this develop quickly, though, which is likely why Arctos expects to work with other companies who’d then build its placement logic into a broader software set.
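If that lower layer existed, middle-layer elements could build to an interface something like the following sketch; the method names are my own invention, since no such standard exists today:

```python
from typing import Any, Protocol

class LowerLayerAPI(Protocol):
    """Hypothetical standard APIs for telemetry and control of
    network and hosting resources, exposed by infrastructure."""

    def get_host_telemetry(self, host_id: str) -> dict[str, Any]:
        """Utilization, capacity, and health for one hosting point."""
        ...

    def get_path_latency_ms(self, src_host: str, dst_host: str) -> float:
        """Measured network latency between two hosting points."""
        ...

    def deploy(self, component: str, host_id: str) -> None:
        """Place (or move) a component onto a hosting point."""
        ...
```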

I’m glad I explored the Arctos implementation, not only because it provides a good example of a middle-layer MoT implementation, but also because it addresses a challenge of component placement that’s already being faced in more complex cloud deployments. If the metaverse encourages “metro meshing” to broaden the scope of shared experiences, component mobility and meshing could literally spread across the world. Arctos isn’t a complete solution to MoT or vertically linked component placement, but it’s a valuable step.

There may be full or partial solutions to other MoT and edge computing software challenges out there, but they may go unrecognized, or never even be presented, because the edge space is only starting to evolve. The final thing Arctos proves to me is that we may need to go looking for these other solutions rather than expect them to be presented to us in a wave of publicity. I hope to uncover examples of other essential MoT software elements down the line, and I’ll report on them if/when I do. Until something develops, I’ll leave the topic of MoT software there.