I’ve been getting a lot of questions about Meta’s in-development AR glasses, Orion. What are the implications? Why did Meta show its cards? Who is going to win? Most importantly, will anyone even care?
Meta revealed Orion at Meta Connect, its developer conference in September. It was a provocative big tech flex, a proper ‘I told ya so’ to the Wall Street haters, and a bold warning shot over Apple’s bow.
Real bold.
It was also the first time the world has ever seen AR glasses worthy of mainstream adoption (and worthy of our sci-fi imaginations). What was deemed impossible with today’s tech was proven possible, moving consumer-grade AR from sci-fi to sci-fact.
And now… the (real) spatial computing race is on. Everything up until now was a warm-up (yes, including the Vision Pro).
With the dust settled, here are my unfiltered thoughts and predictions from the spatial computing front lines.
In Part I, we’ll hit the three major implications: market timing, ecosystem acceleration, and the emergence of AR’s budding romance with AI.
This sets the stage for the rest of the series, where we’ll predict a winner by analyzing the 7 key battlefronts: hardware, software, distribution, AI, developers, philosophy, and leadership.
From there, I’ll probably make this an ongoing series and include the additional players, checking in along the way to see where I was right, wrong, and/or entirely delusional.
Regardless, two things are certain.
First, this is going to be one of the most fascinating tech battles of our time, with the potential to not just transform, but also heal our relationship with technology.
Second, after 8 years in this space, I can finally say with confidence: the time to build is now.
Here’s why…
Timing
For the first time in history, we have a timeline for consumer-grade glasses.
What’s consumer-grade? It’s a Goldilocks zone across numerous variables, the combination of which is the spatial computing equivalent of Elon landing a rocket back on Earth.
Without getting too technical, here are the (oversimplified) big ones:
- You can actually see the real world: it’s not passthrough mixed reality (MR) like the Vision Pro, where the world is displayed through a screen. The display is embedded into the lenses, allowing you to see the real world and make real eye contact.
- It’s not too heavy: industry consensus is that AR glasses need to be under 100 grams. Orion snuck just under at 98 grams. For comparison, the Vision Pro is around 600 grams, and Snap’s new AR glasses are 226 grams…
- The field of view isn’t tiny or obstructed: Orion has a 70-degree field of view. That probably means nothing to you, but without getting too technical, just trust me: it’s a very big deal, and very impressive with this smaller form factor.
- They don’t look ridiculous: they look and feel like a pair of glasses you’d wear in public without a sense of insecurity. They’re relatively sleek, with a hint of style (one reason Meta’s Ray-Ban partnership with EssilorLuxottica is underrated).
- A reasonable battery life: you don’t have to recharge every 2 hours (though this is the main thing Orion has yet to prove).
- The input is natural: you can use your hands & eyes to more naturally and intuitively interact with the content (requiring all kinds of advanced sensors and computer vision software). No hand waving, no controllers, no external tracking systems.
I haven’t tried the glasses myself yet, but in talking to people who have, all the boxes above have been checked, except for one: battery life.
I’ll take it. Of all the technical challenges to address, this one is the most manageable, requiring the least amount of invention.
Pre-Orion, most experts thought this Goldilocks zone was numerous technical breakthroughs away, requiring new science and inventions across displays, batteries, cooling, etc. Many thought it could be 10+ years, others said 20, and some wondered if it would ever happen at all. I was in the 8-10 year camp. But who am I kidding: that was pure hope. The horizon stretched as far as the eye could see, and I must admit, my faith was wavering.
Post-Orion, my faith has been restored. Land ahoy! You might have to squint, but it’s there, and unlike VR, this market won’t be niche…
VR is a ‘destination technology’. You have to set aside time to disconnect and go ‘elsewhere’. And if the Vision Pro taught me anything, it’s this: it doesn’t matter how incredible the experience is, most people just don’t want to get fully immersed.
AR glasses, on the other hand, are an ‘always on’ technology, and at a minimum, the ultimate smartphone complement, paradoxically allowing technology to fade into the background.
If done right, it will bring our eyes back up to the world and to each other, while conforming to how we already operate and exist in a 3D/spatial world, not the other way around (i.e. staring down into a screen, doom scrolling, pitter-pattering with our thumbs).
I know what you’re thinking… let’s not count our golden geese; it’s still TBD when Orion can be productized and manufactured at scale.
True, but that’s not the point. The point is we now know the tech exists, and we’ll be getting the iPhone 1 of AR within the next 5 years.
The implications? Historical investments in 3D/XR will be fortified, future investments are going to accelerate, and the ecosystem is going to get a second (third?) lease on life.
Ecosystem Acceleration
We’ve felt this acceleration/hype before, but it lacked a critical tailwind: real competition.
Meta has forced Apple’s hand, and now, the gloves are off between two of the most formidable companies of all time.
Meta’s also forced the hand of its employees. An internal fire has been lit and the pressure is on to ship. No more hiding behind the R&D veil.
We all know what this means: better products, better developer programs, better marketing, better sales motions, and increased investment writ large. Most importantly, a roadmap with incentive to build.
At last, the spatial money train is a comin’, and not just into R&D.
Some might say this effect already happened with the Apple Vision Pro (AVP). But I’d disagree.
First, as mentioned, the AVP is not real AR, and while technically superior to the Meta Quest 3 on almost every dimension, it’s not a ‘market-creating’ product. It’s a ‘me too’ product in a niche category.
Second, the AVP was great marketing and a huge credibility boost for our industry. But, despite my overall Vision Pro bullishness, I have to admit: the excitement and momentum fizzled almost as quickly as they began, including my own.
Even Apple’s follow-through felt half-assed, from the content to developer enablement, to enterprise go-to-market. Sure, they’re doing these things in spurts, but these motions don’t have the gusto you’d expect from a company like Apple.
In Apple’s defense, I don’t think they’ve fumbled and lost their mojo. It’s likely calculated. As I stated in the essay ‘The Fate of the Vision Pro,’ AVP is a ‘get-in-the-game’ strategy; a mere dipping of the toe and a mechanism for learning and inspiration (for both Apple and their customer). The AVP is the Lisa to their future Macintosh: more art & marketing than product.
I think this was the right approach, but also ironic, as the Vision Pro might have benefited Meta more than Apple.
Zuck is bringing his Jiu-Jitsu skills into the corporate realm, and using the AVP hype to his advantage.
How? By throwing a more practical and established sail into the AVP tailwind, kicking off a potential leapfrog effect. That sail being a cheaper yet comparable device (Quest 3), a broader dev ecosystem, better developer economics, product-market fit with the Ray-Bans + AI form factor, existing enterprise penetration, etc.
We’ll dive deeper into their strategy in Part II. For now, the main point is this: the big tech equivalent of Godzilla vs. Kong is underway, and we all get to benefit as a result.
AI + AR
And finally, the biggest reason to start building now: the romance between AR and AI; a classic case of “you complete me”. Because without AI, AR just can’t become the best version of itself.
Why? Because if you’re going to ask people to put a computer on their face, there has to be a really, really good reason.
In the enterprise, those reasons are aplenty. But for your average consumer? It will have to yield something 10x more magical, useful, and compelling than the rectangle in their pocket.
Fortunately, AI will do exactly that.
The power of GenAI is mind-blowing. But when I use it, I can’t help but think: this simple text box UI is f’ing limiting, especially for consuming GenAI’s outputs.
To be sure, the sentences, the images, and the videos; they’re a mesmerizing portal into our sci-fi future. But as I stare deep into it, I find myself yearning for deeper access to the abyss of knowledge bubbling, frothing, and heaving behind the 2D veil.
AI is capable of showing us so much more, across so many dimensions, with so many different mediums. And it’s capable of doing so in a more intelligent way, a more… generative way, with just the right amount of information, in the right place, at the right time.
This is the idea of a generative UI, one that adapts to what we’re doing, where we are, who we’re with, and what we’re seeking to understand. It’s the right UI for the task at hand. It’s also the right UI for AR glasses.
All of this becomes possible in the age we’re now entering: the age of ‘agentic’ applications; apps in which AI agents work in concert to do things on your behalf (i.e. no more swiping, tapping, and scrolling to get shit done. Things just get done for you via basic commands and contextual awareness).
Why get lost in a screen and a medley of app icons, lists, and buttons… when with a quick verbal prompt, I can cut straight to the core information I need for the task at hand, see some visuals or a list of final recommendations, and be on my merry way?
Or… based upon additional context cues and semantic awareness, AI agents are anticipating and showing me exactly what I need before I even know it?
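To make the generative UI idea a bit more concrete, here’s a toy sketch of the selection logic described above: given context cues (and optionally a verbal prompt), surface only the minimal “card” needed for the task, rather than a full app. All names, signals, and widgets here are hypothetical illustrations, not any vendor’s actual API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Context:
    """Signals a pair of AR glasses might plausibly observe (all hypothetical)."""
    location: str            # e.g. "kitchen", "office"
    activity: str            # e.g. "cooking", "meeting", "idle"
    query: Optional[str]     # an explicit verbal prompt, if any

def generate_ui(ctx: Context) -> dict:
    """Pick a minimal UI 'card' for the task at hand. A stand-in for
    what a real generative-UI model would decide from far richer context."""
    if ctx.query:
        # Explicit prompt: cut straight to the answer, skip the app grid.
        return {"widget": "answer_card", "content": f"Result for: {ctx.query}"}
    if ctx.activity == "cooking":
        # No prompt needed: anticipate from context (current recipe step).
        return {"widget": "recipe_step", "content": "Step 3: simmer 10 min"}
    # Default: show nothing, letting the technology fade into the background.
    return {"widget": "none", "content": ""}

# A quick verbal prompt yields core information for the task at hand...
print(generate_ui(Context("office", "meeting", "next flight to SFO")))
# ...while contextual awareness surfaces help before you even ask.
print(generate_ui(Context("kitchen", "cooking", None)))
```

The key design point mirrors the essay: the UI is an output of the context, not a fixed grid of apps the user must navigate.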
This marks a notable shift in the software-hardware relationship, moving it from a push effect to more of a pull.
Meaning… for decades, we’ve crammed software into boxes based on the limits of chips, displays, memory, etc. These constraints have led to human-computer interactions that, while powerful, often require a steep learning curve to extract maximum value from the machine (e.g. having full agency in the digital realm; writing code, creating content, analyzing data, being productive… not just watching TikTok).
Now, innovation at the software layer is creating a pull effect at the hardware layer, demanding something entirely new. We have software approaching our equal, worthy of eyes, ears, mobility, and semantic understanding. Worthy of perhaps the greatest human ability of all: storytelling.
Storytelling in context, at its fullest capacity (with visuals, simulations, data, etc), and in all its forms; for knowledge transfer, for inspiration, for empathy, for meaning, for preserving the past, for imagining a better future.
I don’t think we’re even close to wrapping our heads around what this means… especially for the moment when AI comes to know us better than we know ourselves, both mentally and physically. This is coming via the combo of biometric sensors + AI, and in the not-too-distant future, brain-computer interfaces + AI + AR, aka the complete merging of man and machine, aka the singularity moment.
While it’s hard to believe, most of us will see this within our lifetime. Don’t believe me? Check out this recent podcast episode with Sam Hosovsky, CEO of uCat, who’s building ‘mind-controlled avatars’ and various missing ingredients up and down the tech stack.
But I digress, this is an entirely different topic for a future essay.
Conclusion
Coming back to the present, the immediate ‘so what’ is this: startups and developers are re-energized, enterprise customers are doubling down, and investor confidence is on the mend; largely because the right destination is in sight.
This effect is most tangible in the enterprise: the first and most important battleground, as this is where most consumers will have their first really good experience with AR (just like in the phase between mainframes and personal computers).
While enterprise AR/VR has struggled to achieve meaningful scale, the savvy executives didn’t throw in the towel.
They realized AR/VR headsets were just one end-point of many, and that interactive 3D can be impactful with traditional interfaces: browsers, tablets, laptops, etc.
One of my customers has a very succinct way of explaining why: 3D just provides a more visual way of working. It gives users a better understanding of complex things and maximizes the bandwidth of communication between experts and laymen. The net result is better/faster decision making, which creates real business impact, and in some scenarios, saves lives (e.g. training in high-risk environments or quality control when building/planning dangerous facilities).
And this is just the ‘human in the loop’ benefit… These customers are also realizing that real-time 3D is a massive unlock for training AI to understand, simulate, and navigate the real world (e.g. teaching robots to navigate a factory, or using AI to optimize numerous assembly line configurations).
These steadfast execs are now gaining internal credibility and unlocking investments in foundational 3D content platforms, complete with the back-end plumbing, the talent, and the workflows to make 3D easier to use across the business (what I call democratizing access to 3D).
And now, with AI + AR on the horizon, those bets are poised to really pay off. Enterprises can continue investing with confidence, knowing they’ll have the right foundation to triple down on when glasses arrive. The companies doing this will be miles ahead of the competition. Those ‘3D apps’ (and VR/MR apps) will slot seamlessly into glasses and AR will feel like an ‘overnight success’.
This is not dissimilar to what we’re seeing in AI. Many companies have decided the risk of underinvesting is much greater than overinvesting; a classic case of paddling well before the wave. If you placed your bets a bit early, you can always play the longer game and let the market catch up. Playing catchup when the wave is beneath you is much harder, with much more dire consequences.
Companies that start building the bridge today between 3D and AR will have a major advantage throughout the 2030s. Fortunately for all of us, I’m seeing many of the largest, most advanced companies in the world starting to do exactly this.
For those enterprises that haven’t started, fret not. You have at least three years. There’s plenty of time to get those 3D-first backends in place and 3D workflows ironed out.
But the time to start is now…
Stay tuned for Part II, where we’ll start to analyze the 7 key battlefronts in this race: hardware, software, distribution, AI, developers, philosophy, and leadership.
Editor’s note: This article was written before Google’s recent Android XR announcement, which the author intends to cover in future articles…
Evan Helda is a writer, podcaster, and 8-year spatial computing veteran. A version of this post originally appeared on MediumEnergy.io, a newsletter exploring spatial computing, AI, and being human in the digital age (subscribe here). Opinions expressed here are his, and not necessarily those of his employer.