Google Lens continues to be a sleeping giant in AR. As a visual search tool, it can identify and contextualize items that you point your phone at. And because visual search carries user intent similar to web search, Google is salivating over the prospect of more high-intent query volume flowing from your camera.
This idea is built on a few premises. For one, it’s sometimes more natural to search visually than with text. That can be the case for fashion items you encounter in the physical world, or for learning more about a local storefront, such as a new restaurant that opened in your neighborhood.
The other factor is user intent. The above examples are very commercial in nature. Throughout the smartphone era, Google has demonstrated that proximity boosts commercial intent, as seen in “near-me” searches. But with visual search, subjects aren’t just nearby; they’re in view.
All of the above sets the stage for visual search’s potential. Google’s current phase is to cultivate and condition user behavior before monetizing it. To that end, we’ve observed two big recent updates for Google Lens – both of which support shopping. Let’s take those one at a time.
Where Do I Buy This?
The first update is all about multimodal search. Google announced last month that Chrome browser users on iOS can search with a combination of text and images. This was already available in Google’s mobile app but now comes to the browser, making it more accessible.
This combination of inputs is known as multimodal search. We noted earlier that visual searches are often more natural than text; multimodal search pairs the two for even greater specificity. For example, you can snap an image of a jacket and ask, “Where do I buy this?”
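Under the hood, this kind of query pairs an image with a text prompt in a single request. Google hasn’t published Lens’s internals, but its public Gemini API accepts mixed image-and-text inputs and gives a rough sense of the pattern. The sketch below is purely illustrative – the model name, file name, and API key are assumptions, not anything Lens-specific:

```python
# Illustrative sketch of a multimodal (image + text) query using Google's
# public Gemini API -- an analogy for Lens-style search, not Lens's actual
# pipeline. Requires: pip install google-generativeai pillow
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # assumption: key provisioned in Google AI Studio

model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
jacket_photo = Image.open("jacket.jpg")            # assumed local photo of the item

# One request combines the visual subject and the textual intent.
response = model.generate_content([jacket_photo, "Where do I buy this jacket?"])
print(response.text)
```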
That brings us back to the point about high commercial intent – the part Google is keen to develop. In fact, it reports that 20 percent of all query volume on Google Lens has commercial intent. This usually involves scanning products and wanting to learn about them in some way.
Meanwhile, multimodal search is developing elsewhere. One of the biggest selling points – and drivers of popularity – for the breakout-hit Ray-Ban Meta Smartglasses is multimodal AI. Users can glance at any scene while wearing the glasses and say, “Hey Meta, what am I looking at?”
Of course, Meta’s hands-free use case is more natural than holding up your phone. That could be what it takes to boost acceptance and adoption. Another boost could come from Apple’s Visual Intelligence integration with the one-tap “camera control” button on the iPhone 16.
Fashion Discovery
Moving on to Google Lens’ second big update: the feature now offers product insights, price comparisons, and local availability when users scan a given product. Importantly, these results can include the exact item being scanned, as well as visually similar products for “fashion discovery.”
This builds on the statistic noted above that 20 percent of Google Lens searches have commercial intent. Users are scanning products they encounter, with the intention of finding out more about them or where to buy them. So with this move, Google’s meeting the demand signals it sees.
Search results are derived from a combination of Google Images, Gemini, and Google Shopping, including the Shopping Graph’s 45 billion product listings. The feature will specialize in beauty, toys, and electronics, including results from Target, Ulta Beauty, Macy’s, Nordstrom, and others.
As for rollout, these shopping insights in Google Lens are already available in the Google and Chrome apps for Android and iOS. For now, the feature is only available in the U.S., and users have to turn on location sharing to use it. This also arrives just in time for the holidays.
Altogether, both of these recent moves from Google could accelerate demand for visual search, as they each venture to reduce friction and activation energy. But the endgame for Google is to monetize visual search, and these moves signal that it’s aiming in that direction.