Last week at Google I/O, we finally learned more about the much-anticipated Android XR platform and its hardware partners. This was a milestone for the broader spatial computing world, with lots of renewed excitement and the latest reminder that AI will be a force multiplier for XR.

That last part is all about Gemini – the throughline for Android XR and its hardware endpoints. This is especially important for non-display smart glasses, whose UX needs a selling point other than visuals. Annotating your world is that selling point (more on that in a bit).

But along with all the excitement was a bit of confusion and conflation in the device classes that Android XR will power. Though the platform will eventually span the full range of the spatial spectrum, the device prototypes Google showed last week sit at a few points along that spectrum.

So what are those devices, and what do they do? Let’s explore the three main device classes that Google paraded out last week. But before dialing in to those devices and their categorization, let’s level set on the broader XR spectrum and the various points along its continuum.

Spatial Spectrum

To revisit our main categories of XR devices, below are definitions and delineations. These flow from the categories used by our research arm, ARtillery Intelligence. To stay organized and avoid double-counting in market sizing, it helps to have firm boundaries around device tiers (a schematic summary follows the definitions below).

Video Passthrough (Handheld)

Attributes: Also known as Mobile AR, this category is defined by AR that happens on smartphones and tablets. It’s technically a form of video passthrough AR, as it augments content from the device’s camera feed. But the biggest distinction is that the form factor is handheld rather than headworn.

Examples: Snapchat Lenses, TikTok Effects.

Video Passthrough (Headworn)

Attributes: This device class includes VR headsets that have HD color passthrough cameras to view the outside world inside the device. That capability is the precursor for augmenting physical objects and scenes with digital elements. Also known as mixed reality, it has become a new standard in VR.

Examples: Meta Quest 3, Apple Vision Pro.

Non-Passthrough VR

Attributes: This device class includes VR headsets that don’t offer mixed reality, though they often have external cameras for motion tracking or basic passthrough video (user orientation). Experiences within the headset are mostly meant to be occluded, insular, and immersive, including gaming and entertainment. Experiences can be 3DoF (rotational head tracking only) or 6DoF (rotation plus positional tracking) – see the sketch below.

Examples: Valve Index, PlayStation VR.
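To make the DoF distinction concrete, here’s a minimal Kotlin sketch. The type and field names are our own illustration, not from any XR SDK:

```kotlin
// Illustrative sketch of the 3DoF vs. 6DoF distinction.
// Type and field names are hypothetical, not from any XR SDK.

// 3DoF: rotation only. The headset knows where you're looking,
// but not where your head is in the room.
data class Pose3DoF(
    val yaw: Float,   // turning left/right (degrees)
    val pitch: Float, // looking up/down
    val roll: Float   // tilting the head
)

// 6DoF: rotation plus position. Leaning and walking move the viewpoint.
data class Pose6DoF(
    val orientation: Pose3DoF,               // same three rotational degrees
    val x: Float, val y: Float, val z: Float // position in meters
)
```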

Optical See-Through (Dimensional)

Attributes: This category includes AR glasses with see-through lenses on which graphics are projected. Optical see-through AR comes in two flavors: dimensional and flat (the next category). Dimensional AR features digital elements that interact with their surroundings through spatial understanding (SLAM). This hallmark of dimensional AR is enabled by external cameras/sensors and software that understands and accommodates spatial geometry. Due to these technical requirements – in both hardware and software – devices in this category are generally bulkier and costlier than other classes.

Examples: Snap Spectacles, Meta Orion.

Optical See-Through (Flat)

Attributes: Similar to the previous category, this includes AR glasses with see-through lenses on which graphics are projected. However, it differs in that graphics lack spatial understanding and dimensional interaction with physical spaces. They’re instead flat or floating overlays such as virtual displays. Use cases include utilities, notifications, and private entertainment & gaming mirrored from one’s smartphone, PC, or console.

Examples: Xreal One, Rokid Max.

Non-Display AI Smartglasses

Attributes: This category includes glasses that have no display system at all. Experiential augmentation happens through audio cues. These devices apply AI to achieve a level of utility that compensates for the lack of visual experiences. For example, multimodal AI applies camera-based visual recognition along with voice refinements (e.g., “What am I looking at?”) to return audible answers. The lack of a display system meanwhile enables style and wearability. Devices are often sleek and light, resembling normal eyewear.

Examples: Ray-Ban Meta Smartglasses.
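To put firm boundaries around these tiers (as noted up top), here’s a minimal Kotlin sketch of the taxonomy above. It’s our own illustrative model for this article, not an ARtillery or Android XR data structure:

```kotlin
// Our own illustrative model of the device tiers above.
// Not an ARtillery or Android XR data structure.

enum class FormFactor { HANDHELD, HEADWORN }
enum class Optics { VIDEO_PASSTHROUGH, OPTICAL_SEETHROUGH, NONE }

data class DeviceTier(
    val name: String,
    val formFactor: FormFactor,
    val optics: Optics,
    val hasDisplay: Boolean,
    val spatialUnderstanding: Boolean, // SLAM-based scene awareness
    val examples: List<String>
)

val tiers = listOf(
    DeviceTier("Video Passthrough (Handheld)", FormFactor.HANDHELD,
        Optics.VIDEO_PASSTHROUGH, true, true, listOf("Snapchat Lenses", "TikTok Effects")),
    DeviceTier("Video Passthrough (Headworn)", FormFactor.HEADWORN,
        Optics.VIDEO_PASSTHROUGH, true, true, listOf("Meta Quest 3", "Apple Vision Pro")),
    DeviceTier("Non-Passthrough VR", FormFactor.HEADWORN,
        Optics.NONE, true, false, listOf("Valve Index", "PlayStation VR")),
    DeviceTier("Optical See-Through (Dimensional)", FormFactor.HEADWORN,
        Optics.OPTICAL_SEETHROUGH, true, true, listOf("Snap Spectacles", "Meta Orion")),
    DeviceTier("Optical See-Through (Flat)", FormFactor.HEADWORN,
        Optics.OPTICAL_SEETHROUGH, true, false, listOf("Xreal One", "Rokid Max")),
    DeviceTier("Non-Display AI Smartglasses", FormFactor.HEADWORN,
        Optics.NONE, false, false, listOf("Ray-Ban Meta Smartglasses"))
)
```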

Mapping Android XR

Now, with that broader definitional backdrop, what devices did Google preview last week, and how do they map to the above categories? Here they are in order of optical fidelity and complexity.

Moohan

Google’s Project Moohan is a joint effort with Samsung and Qualcomm. It’s a Vision Pro-like device that features highly immersive, dimensional mixed reality delivered via video passthrough. The device’s insular and occluded form factor also makes VR experiences possible.

As such, it will compete with Vision Pro and Quest 3, with a price tag somewhere between the two. This could be smart positioning, as it occupies the meaty middle for enterprises and power users who want more functionality and fidelity – not to mention Gemini – without a Vision Pro price tag.

Display Glasses

The Android XR device class that got the most attention and airtime at Google I/O was flat AR glasses. The initial Samsung prototype doesn’t have the spatial awareness and dimensional scene interaction of Moohan, but it does boast a critical attribute for mass-market XR: style and wearability.

The Samsung prototype – soon to be joined by Xreal’s Project Aura – features a monocular heads-up display for utilities such as language translation and navigation. And Gemini will bring intelligence to the table, including Google Lens and other ways to annotate and enliven the world around you.

Non-Display Glasses

Though they have the simplest feature set, non-display smart glasses have been validated as the most mass-market-friendly XR class, thanks to Ray-Ban Meta Smartglasses. This means others will opportunistically rush in, including Apple (rumored) and Google (announced).

Like Meta, Google’s play involves fashion-brand partners – Warby Parker and Gentle Monster to start. Also like Meta, Google will pair cameras that see the world (input) with audio (output) for multimodal AI, object recognition, and situational intelligence – all powered by Gemini.
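As a rough sketch of that camera-in, audio-out loop, consider the following Kotlin example. Every interface here is a hypothetical stand-in; this is not the Gemini API or Meta’s implementation:

```kotlin
// Hypothetical sketch of the multimodal loop non-display glasses run:
// camera frame + voice query in, spoken answer out. Not a real API.

interface VisionLanguageModel {
    // e.g., a photo of a storefront + "What am I looking at?"
    fun answer(imageJpeg: ByteArray, question: String): String
}

class GlassesAssistant(
    private val model: VisionLanguageModel,
    private val speak: (String) -> Unit // routed to the frames' speakers
) {
    fun onVoiceQuery(question: String, currentFrame: ByteArray) {
        // 1. The camera frame grounds the query in what the wearer sees (input).
        // 2. The voice query refines intent ("What am I looking at?").
        // 3. The answer returns as audio (output): no display required.
        speak(model.answer(currentFrame, question))
    }
}
```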

Who Makes What?

So there you have it. To synthesize a few points, one question that emerges from the above is: who makes what? Google is clear that it’s only providing the software and that partners will be enlisted to build the hardware… just like Android’s original positioning in the mobile realm.

But if we use Android as an example, it’s possible that Google will eventually launch a Pixel line of XR devices. It does this today in mobile, where Pixel phones serve as a reference design that shows the OEM world what the very best version of Android should be – “what the artist intended” for hardware endpoints.

We’ll see if it ends up doing this with Android XR. In the meantime, there are patterns in its partnering strategy. For one, its partners will map to two categories: tech companies and fashion brands. The former will make anything with a display, while the latter handles smart glasses.

This comes down to capability. Display glasses will require the technical competency of the Xreals and Samsungs of the world… and others that join that list. Meanwhile, mass-market non-display glasses are in better hands with the Warby Parkers and Gentle Monsters of the world.

Arms Race

But with all of the above, there’s one drawback: lack of vertical integration. Meta and Apple own their entire XR stacks rather than partnering (except Meta’s non-display glasses, built with EssilorLuxottica). The resulting sensor fusion and UX cohesion are more important in XR than in mobile (Android’s legacy model).

But Google does excel in another key category: AI. That brings us back to Gemini – Android XR’s biggest differentiator. It could give Google an edge in XR, given its inherent AI advantages (knowledge graph, etc.) over XR competitors such as Meta (though it’s no slouch) and Apple.

In other words, it’s likely that whoever wins the AI race also wins the XR race. The next era of XR won’t be defined just by visuals, scene recognition, or the other XR success factors to date. It will be a game won or lost on AI. And that’s good news for Google.
