
The intersection of AR and AI will take many forms, as has long been theorized and future-gazed across the XR punditsphere. One of the most promising visions is the ability to generate AR experiences and interactions on the fly, using text prompts.
Bringing the workings and wonder of generative AI to AR, we call this generative AR. After several steps in this direction (more on those in a bit), Snap this week became the first major AR player to bring it to market, letting end-users speak AR animations into existence.
Known as Imagine Lens, the feature lets users enter prompts to create custom lenses (e.g., “turn me into a superhero in a retro-comic animation style”). Users can also progressively refine the lens, or start over, by entering more prompts (e.g., “now apply a darker and more ominous animation style”).
To pull this off, Snap employs a custom mix of in-house and industry-leading AI models. And though it has already integrated generative AI in various places, this is the first manifestation of open-ended, text-to-image AR lenses where end-users are in the driver’s seat.
Historical Context
Speaking of the generative AI that Snap has previously developed and integrated, let’s contextualize this week’s move within those recent efforts. The goal is to trace Snap’s AI evolutionary path so far, and to triangulate where it could go next.
To that end, one of Snap’s first public declarations of its ambitions towards generative AR – though it didn’t use that term specifically – was at AWE 2024. In fact, we were on stage as Snap co-founder and CTO Bobby Murphy discussed the vision for dynamic text-to-image Lenses.
But rather than offering just future-looking platitudes, Murphy had some red meat: as an evolutionary step towards user-facing generative AR, he announced Lens Studio 5.0, including its GenAI Suite. This laid the foundation for a set of AI tools to automate lens creation.
Those tools have since evolved rapidly in Lens Studio, including Easy Lenses and others that let creators automate 3D elements in their lenses through text-to-image generation. Put another way, Snap’s first moves into generative AR were creator-facing rather than user-facing.
In subsequent months, Snap rolled out products that moved towards user-facing generative AR. For example, its AI Video Lenses interact dynamically and dimensionally with scenes: the 3D elements are still pre-ordained, but AI is applied on the fly for serendipitous interaction.
Back to the Present
Fast forward through a few other AI integrations and evolutions – including Sponsored AI Lenses – and we’re back to the present. Making good on Bobby Murphy’s projections from the AWE stage 15 months ago, Snap has now fulfilled its promise to bring generative AR to market.
Snap is the logical candidate to have done this, and to continue developing it. The company has doubled down on AR as a cornerstone of its product and revenue model. This includes large-scale AR glasses investments, which will take a big step towards consumer markets next year.
That commitment has been reinforced by results: Snap now tallies 8 billion lens engagements per day – roughly one lens play per human on the planet, every day. Meanwhile, Meta has left an opportunity gap by abandoning its social lens platform, Spark.
As for what’s coming next, we expect Snap’s work at the intersection of AI and AR to align with its longer-term headworn ambitions. In other words, lenses that serve as ambient and intelligent utilities – such as fashion discovery or shopping aids – could be both on-brand and monetizable.
Meanwhile, back in the present, Imagine Lenses aren’t yet available to everyone. Snapchat+ Platinum and Lens+ subscribers get first crack, as is often the case. Expect wider rollouts over the next year, as those power users uncover fitting use cases and inform product iterations.
