In the novel Accelerando by Charles Stross, the protagonist Manfred Macx is a power user of smartglasses. These glasses dramatically enhance his ability to interact with the world. Over the course of the book, we learn about the glasses’ incredible capabilities, from real-time language translation to dispatching agents to perform complex tasks across the internet. At one point, Manfred has the glasses stolen from him, and it becomes clear that they form a significant part of his cognition.

Offloading parts of your cognition to a device is already commonplace. Many of us rely on our smartphones for upcoming events, directions, reminders, etc. If you take the idea of having a “silicon lobe” to its logical conclusion, you end up at… a computer physically integrated into your brain, aka Neuralink. This is true in the novel too, with the glasses being “jacked in” to the brain directly, allowing for a higher-bandwidth interface and direct sensory manipulation.

In the interim between the iPhone and Neuralink, I predict we will see some kind of Augmented Reality (AR) glasses hit the mainstream for consumers, similar to those in the book (minus the direct neural interface). Such a device may become as integral to your life as Manfred’s glasses are to his, where losing it would be the equivalent of a lobotomy!

This two-part series will provide a deep dive into the following:

  • What are AR glasses and their concrete use cases?
  • What are the key challenges facing AR glasses?
  • What will the first successful consumer product look like, and when will it arrive?
  • What are the key innovations required for the ultimate consumer smart glasses?

We’ll start here in Part I with AR glasses basics and use cases, along with some of their design challenges. Let’s dive in…

What are AR glasses?

AR glasses are a form of wearable technology that superimposes digital information onto the real world. While all AR glasses are smartglasses, not all smartglasses are AR glasses.

The 2024 Ray-Ban + Meta smartglasses are one of the first forays into the consumer market. I’ll be using them as a reference point throughout this post.

Notice how the arms are quite a bit thicker than standard sunglasses from Ray-Ban. These glasses pack in a huge amount of electronics to provide users with a rich “AI + Audio” experience. Using the integrated camera and spatial audio, you can ask questions, listen to music and use AI to perform other tasks. However, they’re lacking the crucial component of AR glasses: a display. Meta isn’t stopping there though, and is expected to announce Orion and Hypernova, their true AR glasses, at Meta Connect next month.


Use Cases

In order to usurp the smartphone and become the primary interface with the digital world, AR glasses must provide a significant boost in utility. The use case that everyone is talking about today is AI integration, and for good reason. In my opinion, every human could get a staggering amount of utility from having a SOTA LLM deeply integrated into their lives. These LLMs are already proving incredibly useful in chatbot form; however, they’ve quickly become bottlenecked – not by IQ, but by context and bandwidth:

  • Context: See and hear everything the user does (privacy and regulation notwithstanding, Zuckerberg highlighted this use case at Meta Connect 2023).
  • Bandwidth: Higher information transfer between the user and the computer (the average person types at a shockingly slow ~40 wpm).

Perhaps the latest wave of AI innovation will be the catalyst for AR glasses to hit the mainstream. To see what the glasses might be capable of, let’s analyze some use cases.

1. Contextual Overlay

The obvious use case for AR glasses is providing updates and real-time information in a timely and contextual manner. Microsoft has achieved a modicum of success in the enterprise AR market with its HoloLens 2, which primarily targets manufacturing and military applications. The glasses provide real-time, hands-free instructions to workers, allowing them to carry out routine tasks more efficiently. Despite being relatively absent from the public consciousness, the HoloLens line has sold more than half a million units since its release.

2. Sensory Control and Manipulation

Whilst we won’t be able to directly interface with our auditory and visual systems for at least the next five years, that doesn’t mean AR glasses can’t give users newfound control over their senses. One of the most liberating things about the proliferation of high-quality Active Noise Cancelling (ANC) headphones is the ability to filter your own audio input. The market has responded well to this ability: in 2020, Apple’s revenue from AirPods sales was larger than the revenues of Twitter, Spotify, and Square combined.

What would this look like for visual data? Perhaps something reminiscent of “Times Square with Adblock on”.

Times Square Adblock

The ability to modify your visual input is a powerful one, and it could be used for more than just blocking ads. For example, during a conversation with a person speaking a different language, you could use a diffusion model to modify their lip movements to match your language. This, plus ANC and a speech-to-speech model, could be an excellent start to the dissolution of language barriers. In order to achieve both of these use cases, the AR display would need to be extremely high-resolution and low-latency, which is a significant challenge we’ll discuss later.
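
To make the moving parts concrete, here is a minimal Python sketch of what such a translation pipeline could look like. Every component name is hypothetical and the stubs simply pass data through so the structure runs; a real system would swap in actual speech-to-speech and lip-resynthesis models:

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """A single video frame plus the audio captured alongside it."""
    pixels: bytes
    audio: bytes


# Hypothetical components, named purely for illustration.
def speech_to_speech_translate(audio: bytes, target_lang: str) -> bytes:
    """Translate the speaker's audio into the listener's language."""
    return audio  # placeholder: passes the audio through unchanged


def resync_lips(pixels: bytes, translated_audio: bytes) -> bytes:
    """Redraw the speaker's lip movements to match the translated audio."""
    return pixels  # placeholder: passes the frame through unchanged


def render(frame: Frame) -> None:
    """Push the modified frame and audio to the AR display and earpieces."""
    pass


def translate_conversation(frames: list[Frame], target_lang: str = "en") -> None:
    # Per-frame loop: translate the audio, then re-render the lips to match.
    # The hard part is not the control flow but running both models at display
    # refresh rate, with latency low enough that the redrawn lips never lag
    # the translated audio.
    for frame in frames:
        frame.audio = speech_to_speech_translate(frame.audio, target_lang)
        frame.pixels = resync_lips(frame.pixels, frame.audio)
        render(frame)
```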

3. Personal Assistant Functions

A person’s name is to that person, the sweetest, most important sound in any language.

— Dale Carnegie

Whilst this is perhaps a subset of “Contextual Overlay”, it’s worth highlighting, as it is a compelling use case far beyond what smartphones can provide today. By surfacing small pieces of data throughout the day, the glasses could make everyone more charismatic and thoughtful. You’d never have to forget a name or a face again.

You can take this further too. According to Carnegie, Abraham Lincoln would dedicate considerable time to learning about the interests and backgrounds of people he was scheduled to meet. This allowed him to engage them in conversations about topics they were passionate about, greatly boosting his likeability. With AR glasses, everyone could do this effortlessly (much to the chagrin of those of us who have expended considerable effort practicing this). This is on top of the more obvious personal assistant utilities! However, for this to be totally effective, the glasses would need to serialize everything you ever see or hear, which is quite the social and regulatory challenge.
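
As a toy illustration of the “never forget a name or face” idea, here is a minimal sketch of the lookup the glasses might perform when a face is recognised. The face ID, contact record, and overlay format are all invented for illustration; the genuinely hard parts (on-device face matching, consent, and where this data lives) are exactly the social and regulatory challenges mentioned above:

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    name: str
    interests: list[str] = field(default_factory=list)
    last_met: str = ""


# Hypothetical local store keyed by an opaque face-embedding ID.
CONTACTS: dict[str, Contact] = {
    "face_0042": Contact("Ada", interests=["rowing", "compilers"], last_met="2024-03-01"),
}


def recall(face_id: str) -> str:
    """Return a short overlay string for a recognised face, if known."""
    contact = CONTACTS.get(face_id)
    if contact is None:
        return ""
    topics = ", ".join(contact.interests) or "no known interests"
    return f"{contact.name} · last met {contact.last_met} · interests: {topics}"


print(recall("face_0042"))  # Ada · last met 2024-03-01 · interests: rowing, compilers
```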

Other Uses

The outlined use cases above are just the tip of the iceberg. In Accelerando, the glasses also perform a variety of futuristic tasks, such as:

  • Dispatching AI agents to perform complex analysis across the internet
  • Direct neural interfacing for higher bandwidth communication
  • Personalized vital sign monitoring

Even without these capabilities, by simply combining contextual overlay, sensory control, and personal assistant functions, we can already see the immense value that AR glasses could provide to users within the next ~3 years. Let’s dive into the technical challenges that must be overcome to make this vision a reality.

Physical Constraints

In order to better understand the technical challenges, let’s first establish some axioms based on the physical constraints of the glasses.

Weight

There is surprisingly little literature on the acceptable weight of consumer glasses, so let’s look at some current products on the market.

Anecdotally, it seems like the Meta smartglasses are light enough for all-day wear, with the XREAL Air 2 Ultra not being too far off. Given the potential utility of the glasses, I think using ~75g as an upper bound is reasonable. This also aligns with Meta’s rumored Hypernova glasses we mentioned earlier, which reportedly sit at ~70g.

Volume & Battery Life

Another concrete axiom we can build from is volume. The shape of the human head is going to stay consistent for the foreseeable future, and whilst there is some leeway to be found in the flexibility of social dynamics, we can expect the form factor (and therefore volume) of the glasses to remain roughly the same as glasses today. To calculate a baseline volume, I went to the closest demo location and measured the Meta smartglasses with calipers.

From my measurements, the volume of each arm is approximately 10 cm³, so 20 cm³ for both. If we naïvely used the entire volume of a single arm as a battery (and allocated the rest to electronics), and assuming a SOTA battery with an energy density of 800 Wh/L at a nominal voltage of 3.7 V, we can calculate the maximum possible battery capacity of any smartglasses as follows:

800 Wh/L × 0.01 L = 8 Wh
8 Wh ÷ 3.7 V ≈ 2162 mAh

2162 mAh is approximately half the capacity of a modern smartphone battery, but an order of magnitude larger than the battery that currently ships in the Meta smartglasses highlighted below (154 mAh). To put it simply, if we want 8 hours of battery life from that 8 Wh cell, the glasses are limited to an average draw of about 1 W, or 3600 J per hour (1 Wh = 3600 J).
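
For concreteness, here is the same arithmetic as a short Python sketch. The arm volume, energy density, and target runtime are the assumptions stated above, not measurements of any shipping product:

```python
# Back-of-the-envelope battery maths using the assumptions above.
ARM_VOLUME_L = 0.01            # ~10 cm^3 of one arm, expressed in litres
ENERGY_DENSITY_WH_PER_L = 800  # assumed SOTA lithium-ion energy density
NOMINAL_VOLTAGE_V = 3.7
TARGET_RUNTIME_H = 8

energy_wh = ENERGY_DENSITY_WH_PER_L * ARM_VOLUME_L   # 8 Wh
capacity_mah = energy_wh / NOMINAL_VOLTAGE_V * 1000  # ~2162 mAh
avg_power_w = energy_wh / TARGET_RUNTIME_H           # 1 W average power budget
energy_per_hour_j = avg_power_w * 3600               # 3600 J/h

print(f"Capacity: {capacity_mah:.0f} mAh")
print(f"Average budget for {TARGET_RUNTIME_H} h of runtime: "
      f"{avg_power_w:.1f} W ({energy_per_hour_j:.0f} J/h)")
```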

Maximizing battery life is absolutely critical for AR glasses, and it’s part of what makes them such an enticing challenge. It requires a full stack solution, from efficient software to efficient hardware to efficient power management. From the above teardown, we can see that Meta hasn’t exactly maximized volume usage, so I expect that later generations of glasses could have significantly longer battery life. Perhaps like in airplanes and electric cars, we will see power sources being used as structural components in the glasses to increase volume utilization.

We’ll pause there and pick it up in Part 2 of this series by going deeper into AR’s technical challenges… and some future gazing. 

Christopher Fleetwood is a Machine Learning Engineer at HuggingFace.

