If you like euphonic comments, how about “more morphs mix metaverse”? It’s pretty clear already that we’re going to see the original concept of the metaverse broadened to the point where it could apply to nearly everything. That’s too bad, because there are some (more euphonics) multi-faceted metaverse missions that actually could impact the architecture and evolution of the concept, and they could get buried in the hype.
The simple definition of a metaverse is that it’s an artificial reality community where avatars representing people interact. That’s what Meta (yes, the Facebook people) seems to have intended. In a sense, this kind of metaverse is an extension of massive multiplayer gaming, and that similarity illustrates the fact that the concept of the metaverse as a platform differs from the concept of the metaverse as a service or social network.
As a platform, a metaverse is a digital twinning framework designed to mimic selected elements of the real world. We could envision a metaverse collaborative mission as one that fairly faithfully mimicked the subset of reality needed to give people a sense of communicating in a real get-together. A game is a mission where only the “identity” of the player is mimicked; the persona the avatar represents isn’t tightly coupled with the real physical person except perhaps in movement, and perhaps not even there. Maybe it’s only a matter of control. As I suggested in earlier blogs, you could also envision an IoT mission for a metaverse, where you digitally twinned not necessarily people but transportation or industrial processes.
What we’re seeing already is an attempt to link blockchain concepts, ranging from NFTs to cryptocurrencies, to a metaverse, then say that anything that involves blockchain or supports crypto is a metaverse. Truth be told, those are more readily linked with the Web3 concept, where identity and financial goals are explicitly part of the mission. That doesn’t mean that you couldn’t have blockchains and crypto and NFTs in a metaverse, only that those things don’t make something a metaverse.
So what does? An architectural model, I think, made up of three specific things.
The first is that digital-twin concept. A metaverse is an alternate reality that draws on real-world elements by synchronizing them in some way with their metaverse equivalent, their twin. Just what gets synchronized and how it’s done can vary.
The second is the concept of a locale. Artificial reality has to reflect the fact that people can't grasp the infinite well. We live in a vast world, but we see only a piece of it, and what we see and do is contained in that piece, which is our locale. We can define locales differently, of course (a video conference could create a metaverse locale), but a locale is fundamental because it's the metaverse equivalent of the range of our senses. This means, collaterally, that the metaverse may have to generate environmental elements and even avatars that don't represent an actual person but play a part: the Dungeons and Dragons non-player character, or NPC.
The third thing is contextual realism. The metaverse isn't necessarily the real world, or even something meant to mimic it, but whatever it is, it has to present itself in a way that matches the experience its mission targets. If we're mimicking a physical meeting, we have to "see" the others in our meeting locale, and they have to move and speak in a way consistent with a real-world meeting. If we're in a game and playing a flying creature as our avatar, we have to be able to impart the feeling of flight.
I think it would be possible to create a single software framework capable of supporting any metaverse model and mission, provided we could define a way of building a metaverse that supplies the general capabilities required for the three elements above. However, the specific way the architecture would work for a given mission would have to fit the subjective nature of metaverses; what makes up each of my three elements will vary depending on the mission.
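To make that a bit more concrete, here's a minimal sketch in Python of what such a framework's core abstractions might look like, one per element. The class and method names (TwinState, DigitalTwin, Locale, Renderer, sync, render_for) are purely my own illustrative placeholders, not drawn from any real product or API.

```python
# A minimal sketch, not a real API: illustrative abstractions for the
# three architectural elements described above.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class TwinState:
    """Whatever subset of a real-world element we choose to synchronize."""
    position: tuple = (0.0, 0.0, 0.0)
    attributes: dict = field(default_factory=dict)


class DigitalTwin(ABC):
    """Element 1: something real, mirrored (selectively) into the metaverse."""

    @abstractmethod
    def sync(self) -> TwinState:
        """Pull the latest real-world state for this twin."""


class Locale(ABC):
    """Element 2: the bounded piece of the metaverse an inhabitant can sense."""

    @abstractmethod
    def inhabitants(self) -> list[DigitalTwin]:
        ...

    @abstractmethod
    def admit(self, twin: DigitalTwin) -> None:
        ...


class Renderer(ABC):
    """Element 3: contextual realism -- present the locale to each inhabitant
    in a way that matches the mission (meeting, game, IoT process...)."""

    @abstractmethod
    def render_for(self, locale: Locale, viewer: DigitalTwin) -> bytes:
        ...
```

A collaboration mission, a game, and an IoT mission would each plug different implementations into those same three slots.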
A good example of this is how a metaverse is hosted. Just as a cloud-native application is a mesh of functions, so is a metaverse. Where a given function is hosted will depend on the specific mission, and in fact some functions would have to be hosted in multiple places and somehow coordinated. For example, a "twin-me" function that would convert a person into an avatar would likely have to live local to each person. I also speculated in a past blog that a "locale" would have to have a hosting point, a place that drew in the elements from all the twin-me functions and created a unified view of "reality" at the meeting point.
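As a rough picture of that split, the division of labor might look something like the sketch below. Again, every name here is an assumption of mine for illustration only.

```python
# Sketch of the hosting split: a "twin-me" function runs local to each
# person; a locale host merges everyone's updates into one shared view.
import time


def twin_me_loop(person_id: str, capture, send_to_locale_host):
    """Runs on or near the user's device: capture -> twin state -> push."""
    while True:
        state = capture()                      # e.g. pose, voice, gestures
        send_to_locale_host(person_id, state)  # in practice, only deltas
        time.sleep(1 / 30)                     # ~30 updates per second


class LocaleHost:
    """Runs at the hosting point chosen for this locale (edge, region, ...)."""

    def __init__(self):
        self.world_view = {}   # person_id -> latest twin state

    def receive(self, person_id: str, state) -> None:
        self.world_view[person_id] = state

    def unified_view(self) -> dict:
        # The merged "reality" at the meeting point, ready to be
        # distributed back to each inhabitant for local rendering.
        return dict(self.world_view)
```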
Blockchain, NFT, and crypto enthusiasts see the metaverse as a GPU function because GPUs do the heavy lifting in crypto mining. I think that this focus misses all three of my points of metaverse functionality because it misses the sense of an artificial reality. The real limit to a metaverse is the difficulty of creating a realistic locale, and the biggest barrier to that is probably latency, because latency limits the realism of the experience.
We could envision a collaborative metaverse with ten people in it, all ten in the same general real-world location. We could find a place to host our locale that would give all ten a sense of participation, provided our "twin-me" functions were adequate. Add an 11th person located half a world away, and we would now have a difficult time making that person feel equivalent to the first ten, because anything they did would be delayed relative to the rest. They'd "see" and "hear" behind the other ten, and the other ten would see and hear them delayed from their real actions.
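A back-of-envelope calculation shows why. Assuming (my numbers, not measurements) roughly 20,000 km of path for someone half a world away and a signal speed in optical fiber of about 200,000 km/s, propagation alone imposes a delay that no amount of hosting cleverness can remove:

```python
# Back-of-envelope: propagation delay alone for the 11th participant.
# Assumed numbers: ~20,000 km path (half the Earth's circumference) and
# ~200,000 km/s signal speed in fiber. Routing detours, queuing, and
# processing would only add to this.

path_km = 20_000
fiber_speed_km_per_s = 200_000

one_way_ms = path_km / fiber_speed_km_per_s * 1000
round_trip_ms = 2 * one_way_ms

print(f"one-way delay: {one_way_ms:.0f} ms")    # ~100 ms
print(f"round trip:    {round_trip_ms:.0f} ms")  # ~200 ms
```

A tenth of a second each way is well past the point where interaction starts to feel laggy, and that's before routing, queuing, and processing add their share.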
This doesn't mean that there's no GPU mission in metaverse-building. I think the twin-me process could well be GPU-intensive, and so could the locale-hosting activity, because the locale would have to be rendered from the perspective of each inhabitant/avatar. The important thing is contextual realism, which GPUs would contribute to but latency would tend to kill. Thus, it's not so much the GPU as where you could put it, particularly with regard to locale.
Everyone virtually sitting in a virtual room would have a perspective on its contents, and on each other. Do we create that perspective centrally for all and send it out? Not as a visual field, unless our metaverse is very simple, because the time required to transmit it would be large. More likely we'd represent the room and its inanimate surroundings as a kind of CAD model and have it rendered locally for each inhabitant.
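The arithmetic, using rough numbers I've picked only to show the order of magnitude, makes the case for local rendering:

```python
# Rough comparison (assumed numbers, order-of-magnitude only): pushing
# rendered frames from the locale host versus sending the scene model
# once and then streaming only small per-avatar state updates.

# Option 1: stream an uncompressed rendered view to each inhabitant.
width, height, bytes_per_pixel, fps = 1920, 1080, 3, 30
frame_stream_mbps = width * height * bytes_per_pixel * fps * 8 / 1e6
print(f"rendered frames: ~{frame_stream_mbps:,.0f} Mbps per viewer")  # ~1,500 Mbps

# Option 2: deliver the room as a model once, then stream twin updates.
participants, update_bytes, update_hz = 10, 300, 30
delta_stream_kbps = participants * update_bytes * update_hz * 8 / 1e3
print(f"state updates:   ~{delta_stream_kbps:,.0f} kbps per viewer")  # ~720 kbps
```

The scene model itself would still have to be delivered, of course, but only once; it doesn't change at conversational speed the way the avatars do.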
This sort of approach would tend to offload the creation of the metaverse’s visual framework from the locale host to the user’s location, but I think that business uses of a metaverse are likely to have a “local locale” host to represent their employees. That means that metaverse applications would be highly distributed, perhaps the most distributed of any cloud applications. It also means that there would be a significant opportunity for the creation of custom metaverse appliances, and of course for edge computing.
The connection between metaverse visualization, metaverse "twin-me", and gaming is obvious, and I can't help but wonder whether Microsoft and Sony used that as part of their justification for buying gaming companies. However, there's a lot to true metaversing that simple visualizing or twinning doesn't cover. Microsoft, with Azure, has an inroad into that broader issue set, and Sony doesn't. They may need to acquire other services, which raises the question of who will offer them.
The metaverse concept is really the largest potential driver of change in both computing and networking, because the need to control latency to support widely distributed communities would tend to drive both edge computing and edge meshing. That would redefine the structure of both, and open a lot of opportunities for cloud providers, vendors, and even network operators down the line. And perhaps not that far down the line; I think we could see some significant movement in the space before the end of the year.