AR comes in many flavors, as is true of any technology still taking shape in its early stages. The most prevalent format so far is social lenses, which enhance and adorn sharable media. Line-of-sight guidance in industrial settings is also proving valuable.
But a less-discussed AR modality is visual search. Led by Google Lens and Snap Scan, it lets users point their smartphone cameras (or future glasses) at real-world objects to identify them. It contextualizes them with informational overlays… or “captions for the real world.”
This flips the script for AR in that it identifies unknown items rather than displaying known ones. That broadens the range of potential use cases, transcending pre-ordained experiences with relatively narrow utility. Visual search has the extent of the physical world as its canvas.
This is the topic of a recent report from our research arm, ARtillery Intelligence. Entitled Visual Search: AR’s Killer App?, it dives deep into the what, why, and who of visual search. And it’s the latest in our weekly excerpt series, with highlights below on visual search’s drivers & dynamics.
Snap & Tap
Starting at definitional levels, what is visual search and what forms does it take? As noted, it lets users point their smartphones at physical objects to identify them. Under the hood, it applies computer vision and machine learning to recognize and contextualize what your camera sees.
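To make that mechanic concrete, here's a minimal sketch of what a single recognition step could look like, assuming an off-the-shelf ImageNet classifier from torchvision. The `identify` function and `street_scene.jpg` are hypothetical stand-ins, and this is illustrative only, not how Google Lens or Snap Scan are actually built.

```python
# Minimal sketch of one visual-search step: classify a camera frame with a
# pretrained model, then treat the top labels as queries for contextual info.
# Illustrative only -- not the actual Google Lens or Snap Scan pipeline.
import torch
from PIL import Image
from torchvision import models

def identify(image_path: str, top_k: int = 3):
    weights = models.ResNet50_Weights.DEFAULT        # ImageNet-pretrained weights
    model = models.resnet50(weights=weights).eval()
    preprocess = weights.transforms()                # matching resize/normalize

    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)           # shape: (1, 3, H, W)

    with torch.no_grad():
        probs = model(batch).softmax(dim=1)

    scores, indices = probs.topk(top_k)
    labels = [weights.meta["categories"][int(i)] for i in indices[0]]
    return list(zip(labels, scores[0].tolist()))

# A production system would follow recognition with a lookup step, e.g.
# querying a knowledge graph or product catalog with the recognized label.
print(identify("street_scene.jpg"))  # hypothetical image file
```

The recognition step is only half the story: the informational overlays described above come from pairing the recognized object with a search index or knowledge graph, which is where incumbents' data assets come into play.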
Use cases include education, travel, and shopping. Students can identify landmarks, while travelers discover and navigate new locales. Socialites can discover new bars & restaurants. And fashion acolytes can identify apparel they encounter IRL. It brings intelligence to all the above.
Beyond use-case value is the fact that these are high-frequency activities. Like web search, visual search’s utility could emerge throughout a given user’s day. This engenders stickiness and repeat use… which can’t always be said for AR lenses whose value lies in novelty or whimsy.
Beyond utility, range, and frequency, visual search’s value stems from another factor that makes web search so valuable: intent. The lean-forward act of identifying objects visually can signal high buying intent when done in commercial contexts such as fashion and food discovery.
Inside Track
These are all reasons Google is keen on visual search. Along with voice search, it could help the search giant future-proof its core business. This factor is amplified by Gen Z's affinity for the camera, and by its spending power as the generation phases into the adult consumer population.
To that end, Google continues to invest in developing and accelerating visual search. For example, a Google Lens button now sits prominently next to the search bar on Google’s mobile and desktop interfaces. This is prime real estate to drive exposure and adoption of any product.
Google meanwhile has an inside track. As the world’s search engine for 20+ years, it has a knowledge graph and image database to fuel AI training sets for object recognition. Snap (Scan) and Pinterest (Lens) are also primed for high-value use cases like fashion and food discovery.
But though all the above adds up on theoretical levels, there are barriers in practice. These include computational challenges as well as behavioral and cultural impediments. Visual search deviates from deep-rooted consumer habits, so it may take a while to reach its adoption potential.
We’ll pause there and circle back in the next installment with more visual search analysis and examples.