
Visual search is a flavor of AR that lets users search what they see. Using tools like Google Lens, they can scan or take pictures of physical items to contextualize or identify them. This can be useful for everything from food to fashion, and in use cases from education to shopping.
That last part has not only attracted Google but another likely player: Amazon. Last week, Amazon built on its existing Lens tool with Lens Live. This adds a real-time component by letting users point their cameras at physical items to activate dynamic product information and overlays.
Specifically, users can open Lens Live and scan their physical surroundings. As searchable items are identified, they're flagged with small tappable dots. Tapping one activates a carousel – powered by Amazon’s Rufus – where users can add visual results to their Amazon cart or wish list.
Prior to this update, the process involved taking a picture from within the Amazon app, uploading it, then waiting for visual results to process. With Lens Live, descriptive overlays appear as objects come into view, cutting out steps and finger taps for more immediate gratification.
Surface Area
These real-time UX updates could lessen the friction that stands as an adoption barrier for visual search. Its value proposition is otherwise strong, as visual search checks several boxes for killer-app status, including high utility and potentially high frequency (just like web search).
Visual search is also natural for objects that are easier to scan than describe in words. That includes anything that’s visually complex, such as (again) fashion, food, and furniture. Another common trait in these examples is that they’re all products… which makes them monetizable.
To go deeper on that last point, the smartphone era showed that searches conducted in proximity to a given subject (e.g., “Starbucks near me”) indicate high intent and thus CPC value. Visual search subjects aren’t just in proximity but in view, while offering more dimension than text inputs.
That last part is the reason that Google, Amazon, Snap and Apple are investing in visual search. Adding to the competitive field, one of the most popular features in the breakout-hit Ray-Ban Meta smart glasses is multimodal AI, which is basically visual search (more on that in a bit).
Lastly, visual search is well-timed for the AI era, as it relies heavily on computer vision and machine learning. As players like Google invest in visual search to future-proof and add surface area to their core businesses, it aligns nicely with parallel investments to develop and advance AI across the board.
Activation Energy
So if visual search has all these propellants, why hasn’t it exploded? One answer is that it has… sort of. Google reports 20 billion visual searches per month on Google Lens, which makes it the most penetrated form of AR. But that compares with about 300 billion monthly text searches.
Put another way, it’s far from mainstream. And the reason traces back to something noted above: friction. There are too many steps to activate visual search on a smartphone, including tapping into apps and menus, then holding up your phone to scan or snap an image.
Amazon’s Lens Live reduces a few of those steps, but more streamlining is needed. To that end, the real force multiplier for visual search could be the ongoing rise of smart glasses. There, the UX can be more ambient and intelligent, thus eliminating all that problematic “activation energy.”
That brings us back to Ray-Ban Meta smart glasses (RBMS). The product has validated this principle to a degree through the popularity of its multimodal search: users can activate visual inputs that prompt audio outputs (hence, multimodal) to identify or contextualize real-world objects and scenes.
RBMS signals the start of visual search’s transition from handheld to headworn, and the signs are positive so far. If visual search is the killer app we believe it could be, smart glasses could be the vessel that unlocks its utility and mass-market appeal. But it could take a while.
