Though it’s often painted as a technology that will steal XR’s thunder, AI is the best thing that’s happened to XR in a while. AI can be an intelligence backbone for XR, while XR can be the front-end graphical interface for AI. The two technologies complete each other in lots of ways.
That goes for user-facing XR experiences, such as generating world-immersive overlays and lenses on the fly through text prompts. And it applies to creator-facing work, such as streamlining workflows through the time-saving and capability-boosting power of generative AI.
Both angles were recently advanced by Snap, which launched its GenAI Suite in Lens Studio 5.0. This gives lens creators generative AI tools to do things like rapidly prototype 3D designs from text prompts. Snap also teased the other angle noted above: user-facing generative lenses.
The convergence of these technologies was also explored in a recent report by our research arm, ARtillery Intelligence. As such, it joins our weekly report-excerpt series, with the latest below. This week’s segment zeroes in on the high-level aspects of AI & AR’s collision.
Applications & Endpoints
Gen AI is the new black, ranging from early leaders like Midjourney to recent entrants like Meta Imagine. Generating media with text inputs seems to be blowing everyone’s minds. Similarly, conversational AI like ChatGPT continues to preview the future of information retrieval.
Some have even called this the death of the Google-dominant era of search. Though that could be overblown, there's ample promise in the applications and endpoints for technology from the likes of OpenAI. This competitive landscape is already crowded and will only intensify in the coming years.
Amidst all that excitement, one question we're often asked is how generative AI will converge with XR. And though it's early in the conceptualization of generative AI, we already see several ways the technology could merge with XR to engender new functions, use cases, and points of value.
These possibilities are split between user-facing and creator-facing. Starting with the former, gen AI could be used to create serendipitous AR experiences on the fly, based on voice prompts. This contrasts with how XR is experienced today, which is mostly preordained.
In other words: AR experiences and interactions are deliberately and painstakingly developed in advance, then experienced when activated in specific use cases. AI-generated XR – let's call it "generative XR" – could be more open and flexible, creating experiences dynamically.
Similarly, with the help of generative AI, AR could evolve into more of a utility. Like search, information could be pull-based. Users could activate and observe 3D animations on demand, with possibilities from entertainment to “how-to” content (e.g., tying a tie).
Ideation & Inspiration
As for creator-facing features, generative AI could accelerate developer workflows, as Snap has done with GenAI Suite. This includes generating 3D models that are the ingredients for XR. Such outcomes would fill a key gap in XR's value chain, as 3D content creation is currently a bottleneck.
Think of this sort of like a creative assistant. During creative workflows, AI can be used to build rote elements of 3D environments, such as backgrounds or placeholder assets. This is similar to Meta's concept of a "builder bot" for 3D worlds, which doesn't exist yet but could be a model or target.
Beyond AI’s role as a creative assistant, some AR creators already use it as an inspirational tool. As part of their ideation process, they prototype or storyboard AR lenses by generating artwork through tools like Midjourney. This can help them refine ideas or spark creative directions.
Other possible avenues include utilizing conversational AI (e.g., ChatGPT) for intelligent assistant functions in AR & VR headsets. Indeed, the inputs that control these devices are still being conceived for Apple Vision Pro and others, so it could be an opportune time for intelligent voice assistants to emerge.
But all the above is admittedly early and speculative. Watching the convergence of AI and XR play out will be the fun part, though it will take years to fully materialize. As we've learned from past tech cycles, use cases and endpoints are discovered organically and slowly. So we're still in chapter 1.
We’ll pause there and circle back in the next installment with more excerpts and insights. Meanwhile, see the full report here.