Google I/O: The AR Angle

As you may know, Google held its annual I/O developer conference this week, in person for the first time in a few years. Returning to Shoreline Amphitheatre for its signature event, the company rolled out the standard slate of search, mapping, and AI updates.

Mixed in was some love for AR – a technology for which Google has shown varying levels of commitment over the past half-decade. This includes updates to visual search (a cousin of AR), as well as Google’s rendition of “one more thing” magic for its teased AR glasses.

To zero in on these AR highlights from I/O and to break down their strategic implications, we’re featuring the keynote for this week’s XR Talks (see video and narrative takeaways below).

The AR Space Race, Part I: Google

CTRL+F

Starting with the visual search updates noted above, Google announced “Multisearch Near Me.” This isn’t inherently AR but is closely related and lays a foundation for relevant AR experiences in the physical world (a.k.a. the metavearth). This is an area of AR that could birth killer apps.

Backing up, what is multisearch? In short, it involves blending various inputs (text, voice, visuals) as formats for search queries. So rather than just text, you can search using images. This is done with image files or a live camera feed to search for visually similar items or products.

This is also the idea behind visual search, a la Google Lens. Point your phone at objects to identify or shop for similar items. Google calls this “search what you see,” and positions it as having a CTRL+F function – the keyboard shortcut for on-page searches – for the physical world.

Meanwhile, a similar value proposition lies in the under-utilized Google Live View. This lets users generate 3D walking directions in urban areas, including dimensional AR overlays that guide them and replace the common overhead mapping paradigm. It’s still early in development.

Back to multisearch, the “multi” part comes into the picture in blending several formats. For example, start a search for a new jacket using images or Google Lens, then narrow down results with text (e.g., “the same jacket in blue”). This could have lots of applicability to shopping.

Who Will Build the Metavearth?

Local Flavor

This all leads up to “Multisearch Near Me.” If multisearch and “near me” local business searches had a baby, that’s essentially what was introduced at I/O. It lets users search using images or screenshots along with the phrase “near me” to see results for related local businesses.

So if you see a pair of sneakers on the street or an enticing dish on Instagram, snap a pic and use it to search for retailers or restaurants that offer similar fare. That series of steps is a bit roundabout and therefore may be an adoption impediment, but it’s just the beginning.

All of the above will evolve in a few ways. For one, Multisearch Near Me will expand beyond single items to entire scenes. Users will be able to choose multiple objects within an image or frame to launch searches for related items or see identifying overlays.

This could all converge at some point with something we’ve noted above: Google Lens. The beauty of Google Lens is that visual searches happen in real time as you pan your phone around. Search activation dots pop up on your screen where you can launch a visual search.

This real-time approach offers a more seamless UX, as opposed to screenshots and image selection. In other words, when Multisearch Near Me comes to Google Lens – with a handoff to Google Live View for last-mile navigation – it will be a more elegant way to find things locally.

Captions for the Real World

While we’re speculating, the longer-term convergence for all the above involves hands-free visual search. In other words, AR glasses... which Google also teased (see video below). In fact, an adoption barrier to mobile AR is arm fatigue and social awkwardness from holding up your phone.

Of course, AR glasses come with their own dose of social awkwardness, as seen in Google Glass. A few things may be different now for Google AR glasses: underlying tech and use cases. The former has advanced closer to the realm of socially acceptable, Wayfarer-like smart glasses.

Meanwhile, the use case (why should I wear these things?) is a lesson Google learned from Glass. That’s why its glasses are now positioned for a more practical, sober and focused use case: real-time language translation. And Google can pull it off given Google Translate.

In fact, we’ve examined this use case in the past as being a potential killer app. That term is thrown around loosely, but real killer apps are utilities that solve problems for a large addressable market. Real-time language translation, or “captions for the real world,” checks those boxes.

Again, much of this is speculative, but the dots are all there. Of course, take all of this with the appropriate salt tonnage given the fickleness of style and culture when it comes to tech that goes on your face. This is a category where good tech and design don’t guarantee success.

We’ll pause there and cue the full keynote video below. Use chapter markers to jump around…

Google Keynote (Google I/O ’22) (https://youtu.be/nP-nMZpLM1A)