AR comes in many flavors. This is true of any technology in early stages as it twists around and takes shape. The most prevalent format so far is social lenses, as they enhance and adorn sharable media. Line-of-sight guidance in industrial settings is also proving valuable.
But a less-discussed AR modality is visual search. Led by Google Lens and Snap Scan, it lets users point their smartphone cameras (or future glasses) at real-world objects to identify them. It contextualizes them with informational overlays… or “captions for the real world.”
This flips the script for AR in that it identifies unknown items rather than displaying known ones. That broadens the potential use cases, transcending pre-ordained experiences that have relatively narrow utility. Visual search has the entire physical world as its canvas.
This is the topic of a recent report from our research arm, ARtillery Intelligence. Entitled Visual Search: AR’s Killer App?, it dives deep into the what, why, and who of visual search. And it’s the latest in our weekly excerpt series, with highlights below on visual search’s drivers & dynamics.
One ongoing challenge in AR is finding killer apps. A possible candidate was hinted at during a Google event last year. Though Google has since discontinued its ‘Project Iris’ AR glasses (a return is rumored), the focused use case at the time had validity: real-time foreign-language translation.
After ingesting and processing spoken audio, the device would display translated text in the user’s line of sight. This builds on Google’s existing capabilities with Google Translate. It’s also a highly utilitarian use case that applies to a large addressable market.
Panning back, language translation could represent something broader: captions for the physical world. It starts with language but could spawn identifying overlays for anything that can benefit from “translation.” We’re talking storefronts, restaurant menus, and fashion discovery.
And that all traces back to visual search and its many flavors explored throughout this report. That starts with handheld formats but will amplify in value and utility when it’s on your face. The vision (excuse the pun) is hands-free and AI-driven ambient AR, activated when it’s needed.
But to be fair, Google’s demonstration of AR language translation was a concept, and we should always be wary of concept demos. The underlying tech and practical realities (e.g., glasses that record audio) should be considered, along with several other technical challenges.
Deliberate & Refined
Technical issues aside, the other lesson in use cases like language translation is focus. Google learned from Google Glass that use cases should be deliberate and refined. There should be a single good answer to the question, “Why do I need this thing on my face?”
Whether it’s language translation or something else, AR’s eventual killer app will also likely carry a few traits including utility, broad appeal, and high frequency… not necessarily fun & games. Indeed, killer apps such as web search historically carry some combination of these attributes.
Unpacking that a bit, the progression toward utility is an evolutionary arc that many tech products follow: fun & games at first, then lasting value in everyday utilities. Consider the iPhone’s path from novelty apps like iBeer and Zippo to staples like Uber and WhatsApp.
The same happened on the web, where killer apps are decidedly mundane: search, email, news, etc. “Mundane” sounds like a bad word, but it’s not. Though not as sexy as the novelties that preceded them, such apps represent sustainable, scalable business models.
AR, including language translation, could follow a similar path. But there’s still a ways to go in capability and cultural acclimation. Though it has ample appeal, AR is still a departure from deep-rooted consumer habits. And paradigm shifts like this always take time.