Over the last few days, I’ve had the opportunity to try Apple’s new Vision Pro headset. The flaws of this product are apparent to everyone: it costs too much, it’s bulky, its battery life is too short. Nonetheless, I’ve walked away from my time with the product speechless, both worried about whether and how this technology will be abused and thrilled at the new things that are (or will be) possible.
First, some basics: The Vision Pro is a computer you wear on your face. When you wear it, you begin by seeing the world around you, but it is not transparent. Everything you see is video, captured by a network of cameras and other sensors mounted around the device and shown on two postage-stamp-sized displays with about 11 million pixels each (more than the roughly 8.3 million of a standard 4K TV). It’s a good enough imitation of reality, but it is far from perfect. In low light, it becomes apparent that you are viewing the world through cameras: the video becomes grainy, the colors muted. Even in strained conditions, though, the video is good enough that I have not once experienced sickness or headaches—common problems with other virtual reality headsets.
You use Vision Pro primarily with your eyes and hands. To click a button, you simply look at it and pinch your thumb and index finger together. To move a window, you simply “grab” it using the pinch gesture and move your hand in the direction you want to move the window—up, down, left, right, but also toward and away from yourself. The windows remain fixed in place, as though mounted. You can get up from where you’re working, go to another room, and your windows will be exactly as you left them. Similarly, you can put windows in specific places (a timer over the oven, for instance), and they will remain there until you either move them or turn off the device. You can augment your hands and eyes with a Bluetooth trackpad or keyboard; I highly recommend this if your intention is to do productive work.
One way that the Vision Pro achieves its immersion is through outstanding audio. Two small pods mounted next to your ears serve as speakers. The device uses its detailed understanding of the room you are in, including the acoustic qualities of the materials it detects (glass, curtains, exposed brick, drywall, etc.), to dynamically adjust audio in real time. The result is audio quality that, to my ears, matches multi-thousand-dollar loudspeakers. Apple’s visual technology is good, but the audio is a truly convincing representation of reality. When I put myself in a virtual environment, such as the top of a mountain, it truly sounds like I am there. In fact, in Apple’s mountainous environments, you can even shout and hear an echo—a whimsical touch.
Much of this functionality, of course, is achieved with artificial intelligence. Using the Vision Pro involves several AI models, perhaps dozens, making predictions simultaneously and constantly—understanding the depth of your room, tracking the precise motion of your eyes and hands, dynamically adjusting the camera feeds to convince your brain that it’s looking at the real world. To use the Vision Pro is to engage in a close cognitive partnership with artificial intelligence just as much as, if not more than, using ChatGPT. You are wearing AI on your head, next to your brain.
Social media has been replete with videos of people wearing the Vision Pro in public—driving a car, riding the subway, crossing a busy intersection—as a stunt. Perhaps the less tech-savvy expect this to become a thing soon, but I do not. This device seems purpose-built for indoor use: it is both heavy and fragile. Windows stay fixed relative to the environment; they don’t move with you, meaning anything more than a trip across your home is impractical.
This is also a deeply solo device. If a laptop is a “personal computer,” this is… a very personal computer. When I put myself in one of the Vision Pro’s built-in “environments” (see screenshot below, and note that each of those windows appeared to me to be about a six-foot square when I captured this image), I feel entirely transported, but not to a physical place. Instead, it feels like a new world, one where the digital tools that have come to enable much of my life are native. No longer are digital tools confined to tiny screens; here, they have leapt out and can be made life-size, or even larger-than-life. In some important sense, this feels appropriate: Wearing the Vision Pro enables my digital tools to occupy the same size in physical space as they have for many years in my cognitive space. Some will call this a dystopian invasion, but I see it as a recognition of reality. Software has far outgrown its physical confines, and the Vision Pro simply acknowledges this fact.
Because software in the Vision Pro is spatial, I find it changing my experience of computing in unanticipated ways. My memory has always been spatial—I remember information more effectively when I learn it in an unfamiliar setting, even if the information and the setting aren’t related. For instance, there are otherwise forgettable podcasts that I recall well because I listened to them in a novel setting. By being associated with a different place, the memory is somehow richer. There is a wealth of literature suggesting that memory is encoded spatiotemporally by the hippocampus; perhaps the richer, multi-dimensional context enabled by spatial computing more naturally mimics the way our brains map and recall information.
It can be hard to remember all the information our phones and laptops throw at us daily: very often, it feels like a deluge of undifferentiated and context-scarce data. On the Vision Pro, I’m finding recall to be easier. Information is no longer on my phone; it’s in the room. Imagine walking into a new room for a minute or so and being asked to recall the layout of furniture, decorations, and the like, versus studying a list of 15 items for a minute and being asked to recite that. The former task is far easier for humans than the latter because the brain—or at least my brain—is inherently spatial. Something similar is happening in my early experiences with spatial computing. Perhaps it’s just a novelty that will fade away, but perhaps there is something powerful there.
This new world feels like a place I can visit—a deeply pleasant place where I have capabilities I do not have in the real world, and in which I can focus peacefully for extended periods. Yet it is also a heavy place, in a metaphysical and literal sense: after all, you’re wearing a pound of aluminum, glass, magnesium, and silicon on your head. In that sense, it feels more separate from the real world than an iPhone or a laptop. In fact, I find myself wanting to minimize my use of those devices: When I want to live in the digital world, I’ll live in the digital world. And when I want to live in the real world, I’ll live in the real world. Steve Jobs likened computers to a “bicycle for the mind,” but for many of us they have come to resemble a pack of cigarettes. The Vision Pro feels closer to a bicycle than any hardware product I’ve used in recent memory.
One day, this headset will be a pair of nondescript glasses, or perhaps even contact lenses. By then, the merge will be much further along—and that’s likely to be a while from now. For now, I like the sharp distinction the Vision Pro creates. It makes me appreciate the digital and physical worlds for what they are, with their unique advantages and disadvantages. It makes me want to put away my phone and turn off my TV: that stuff is for in there, not out here. Out here, there is so much else to do: Why use a poor substitute for your digital world when you could be enjoying the real world? Now that the digital world is a place, it feels like somewhere I can choose to go.
Will some people decide that the digital world is the only place they wish to be? Without a doubt. But in a way, I think the constraints of the Vision Pro are what make it such an interesting product. It allows you to immerse yourself in digital life, but at a cost to convenience and comfort and, of course, for now, at a steep price. As Apple and its competitors iterate on this product category—and iterate they will—the tradeoff will become easier to make, and the distinction between digital and physical will gradually fade away. I do not pretend to know where that will lead us. Perhaps we will be able to share our digital environments with one another, enabling new forms of social bonding. Or perhaps it will remain a deeply isolating experience.
Nothing, however, is fated: there is no logic of technological destiny that predetermines any of this. The decisions about how to grow this powerful technology will be made by human beings—engineers and designers at Apple, Meta, and other companies; entrepreneurs developing new digital experiences; and each of us, as individual consumers, as parents, and as friends. I hope we make our decisions more conscientiously than we did with smartphones, social media, and the like. The stakes feel higher than they did a decade ago; they likely are.