Where Do You Stand?
On ghosts
A few days ago I bade farewell to an old friend.
That’s how it felt, at least. The mechanics were decidedly less dramatic than the feeling: all I did was drag a long-residing application from my macOS dock and watch its application icon poof, in signature Mac style, into nothingness.
The app is called “BBEdit.” For those steeped in the history of independent software development on the Mac, the name is legendary. It is a dead-simple text editor—the “BB” in the name references the company that makes the app: “Bare Bones Software.” Primarily intended for coding—but outstanding for writing too, especially if you write, as I often do, in Markdown—BBEdit is perhaps the canonical example of what is sometimes called a “Mac-assed Mac app.”
Do not let its austere interface confuse you: this is an app of unbelievably rich functionality and pristine engineering. Famously, BBEdit is built to handle files whose size would bring Microsoft Word, Apple Pages, and any other app made by the trillion-dollar companies to a screeching halt.
I started using BBEdit around 2003, when I first learned to write HTML, PHP, Perl, and, eventually, C. Even back then, BBEdit was a “classic,” having shipped its first version in 1993. In high school I wrote papers and love letters in BBEdit; in college I wrote my theses. I wrote my father’s eulogy using this tool. Writing is just crystallized thinking, so in many ways, I learned to think with BBEdit as my instrument.
I love this app the way anyone loves their most trusted and longest-lasting tool. Nothing will ever quite match it. I tried, and ultimately discarded, generation after generation of supposedly “more modern” text editors and coding apps, including the early generations of AI coding apps. I stuck fiercely to my old ways. But now, at least for me, BBEdit has reached the end of its useful life.
In its place are a suite of new tools which, together, constitute the most exhilarating revolution in digital technology I have ever seen. Yet I cannot help but see the symbolism that an app built in almost the same year that I was born has become outmoded by artificial intelligence. I try to look eagerly toward the new, to dream of the almost-possible. Yet I could not help but wince when I saw the BBEdit icon disappear so frictionlessly. An era of technology, a set of skills, and an approach to the world, slipped away at the drag of a cursor.
I fight between my inner conservative—the lover of the familiar for familiarity’s sake—and my inner techno-accelerationist, who impatiently desires ceaseless change. But ultimately, I let BBEdit fade away. This, in the end, is where I stand.
—
My approach to “prompting” LLMs is stupidly simple: I speak to them as though they are sophisticated, knowledgeable, and capable colleagues. I do not “prompt engineer,” and I have always suspected that this skill would quickly grow irrelevant. Nor do I practice any other tricks intended to eke out additional performance from AI models. I have continually assumed those skills will be obviated by near-future models, which will do much better on hard questions. More broadly, I’ve never found it that useful to break down software engineering tasks into a large number of smaller tasks that models can do. I just want a software engineer in the cloud.
Instead, I simply ask for answers or for work to be done as I would of an omnicapable colleague. I do often ask extremely detailed questions, however, and I have sometimes seen people call this “prompt engineering.” I don’t find it helpful to think of it that way. “Prompt engineering” is a set of workarounds for today’s models, and the next models make those workarounds obsolete. The deeper and, in my view, better bet is simpler: the models will keep getting smarter, so you should just ask them for what you really want.
What this approach means is that I often “leave money on the table”—I don’t usually learn the skills you need to get the current models to do their very best. But what this also means is that I am prone to immediately recognizing when models have crossed a qualitative “capability threshold.” Some new models will, quite suddenly, be able to perform tasks their predecessors simply could not competently or reliably do just the day prior. I notice these transitions quickly; they slap me in the face.
There have been a few such transitions in the last year or so. The initial reasoning model from OpenAI, o1-preview, could clearly answer complex questions requiring analysis of several sub-questions followed by synthesis, in a way earlier models could not. OpenAI’s Deep Research and o3 marked another major transition just a few months later because of their ability to extensively search the web; they became full-blown junior research assistants.
In the last few weeks, I believe we have crossed another threshold, and this one may well be the most profound of the ones I have described. We have created digital junior software engineers, capable of reliable, end-to-end autonomous execution of reasonably complex software projects. Put a bit more simply, we have created digital agents capable of using computers to do a large fraction of the tasks that can be done purely using a computer.
The best way to experience this for the first time is with coding agents in a command line interface. For those unfamiliar: a command line is the textual computing interface that preceded the graphical user interface (windows, file icons, and the like). For the first several decades of computing, command-line interfaces were the only way to use a computer.
Because of their utility and efficiency, every modern operating system retains a command-line interface (CLI). And because developers, system administrators, and other technically savvy professionals rely on them daily, CLIs have remained a fixture of modern digital life. A user well versed in their operating system’s CLI can do many of the same things you’d use a graphical interface to do, sometimes with far greater versatility and efficiency.
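For readers who have never used one, a single small example illustrates that versatility. The snippet below (the filenames are purely illustrative) renames every `.txt` file in a folder to `.md` in one stroke—a task a graphical file manager would force you to do file by file:

```shell
# Rename every .txt file in the current directory to .md in one pass.
# ${f%.txt} strips the old extension; mv performs the rename.
for f in *.txt; do
  mv "$f" "${f%.txt}.md"
done
```

One terse line of text replaces dozens of clicks—and, crucially, that line can itself be generated, run, and checked by a machine.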
In a way, the CLI foreshadowed the modern LLM “chatbot” interface by more than a half century (the first CLIs date to the 1960s): the computer not as a gallery of candy-coated app icons and flashy (some would even say addictive) UI elements, but as an empty box and a blinking cursor, awaiting instructions. In retrospect, then, perhaps it is not surprising that the ancient command line has ended up being such a superb form factor for the modern LLM.
Rather than having to teach AI systems to navigate graphical user interfaces, with all their affordances for human vision, ergonomics, and foibles, the command line allows an agent to operate on its home turf: the domain of pure language. These agents can read, edit, and create files on your computer, execute scripts and applications, retrieve files from the web, and perform many other tasks that chatbots simply cannot do.
Like a chatbot, you can type to a CLI agent in natural language. Unlike a chatbot, though, these agents can do far more than merely retrieve information: they take something approximating the full range of actions that a software engineer could take were they sitting at your keyboard. They can try to do things, encounter errors, troubleshoot, and write or rewrite software to get around those errors. You type some words, and you watch a computer in the cloud—Claude, GPT-5.1 (and now 5.2), Gemini 3—use the computer sitting in front of you. “Ghosts,” Andrej Karpathy has called them.
A full range of computer-hygiene utilities to scan for security vulnerabilities, wasted storage, and overall system health? Easy. A recreation of some corporate LLM research to test a novel hypothesis? Doable in an hour or so, with a full report and a nice-looking microsite. An interactive simulation of a lake and river system to give me a better intuition for hydrology? About twenty minutes. Near-autonomous retrieval and analysis of economic datasets I care about? Half an hour, and most of this was me verifying that the model’s scripts worked. A machine-learning-enabled baby monitoring system, capable of determining different kinds of newborn cries and alerting me to them? Half an hour again, though I will need to wait until my son is born to verify this one. A shockingly convincing—and playable!—recreation of Minecraft? About fifteen to twenty-five minutes.
I cannot quite put into words what it is like to enter a flow state with these tools at my hands. As a near-lifelong computing aficionado, the last year alone has brought the most significant changes I have ever witnessed to how I conduct my personal and professional affairs in the digital world. And I know that in the grand scheme, I am an old man using this technology. Children born today, raised with these capabilities (and more) as table stakes, will do things that confound, shock, amaze, and ultimately, in the aggregate, enrich us all.
—
One word I have used to describe what the mechanization of intelligence will feel like is conscientiousness. What I mean by this is not that machines themselves will be conscientious (this will be a matter of opinion and circumstance), but instead that the world will contain vastly more products of what would have before required conscientious human thought and effort.
Many recent models have wowed me in various ways, but none more so than Anthropic’s Claude Opus 4.5. This may well be the single best language model ever made, combining coding prowess, intellectual depth, and exceptional writing. But above all else it is conscientious.
I have been thinking recently, for both personal and professional reasons, about kids’ online safety and the problem of conscientiousness. I know that it is possible for a child to spend substantial amounts of constructive, positive time on computers; my own childhood, in the admittedly very different early-2000s digital ecosystem, is a testament to this. I also know it is possible, and perhaps much more common today, for children to fall into compulsive and addictive traps.
It is usually possible to tell these two modes of child-computer interaction apart when they are presented side by side, or when a parent is observing their child. But a parent cannot spend their day literally supervising their child on the computer. Even if they could, the act of parental supervision itself changes the experience of the child; there is no way I’d have been able to learn coding with my mother watching over my shoulder the entire time. Existing parental controls, however, rely on rigid rules, and no set of rules can capture the full set of judgment calls required to render a particular activity “productive” or “unproductive.” Is watching YouTube unproductive? It very much depends upon what you are watching and why you are watching it.
After hours of work with Opus 4.5, I believe we are already past the point where I would trust a frontier model to serve as my child’s “digital nanny.” The model could take as input a child’s screen activity while also running in an on-device app. It could intervene to guide children away from activities deemed “unhealthy” by their parents, closing the offending browser tab or app if need be. It could offer parents daily or weekly reports on their children’s activity, working with the parents over time to refine their definitions of “unhealthy” or “unproductive” computer use. It could enforce strict time limits, always guiding children toward enriching activities.
Would I trust it blindly? Of course not; I’d need ample tools for oversight. Would I gradually experiment with it instead of diving in altogether? Absolutely, as most any parent would. Would I want it to replace time I or my wife spend with our son? Obviously not. But the functional point here is the most important: such a service, well implemented, would allow me, via an agent, to amplify the amount of conscientious activity in my family life.
It seems like it could be possible now, and that it could plausibly be almost wholly positive. And yet I cannot deny that there is something strange about it, something a little off about contemplating life among the ghosts.
What would it mean for an AI to know your child in this way—in some sense, a way that you, as the parent, never see, perhaps should not see, for the sake of your own child’s development? Or is supervision by AI not that different from supervision by a parent? What does it mean for a child to be alone, in the physical sense, but supervised? Are any of our prior intuitions about “parental” supervision actually helping us here? Are my techno-optimist instincts entirely wrong, in this case? Or is history—the thing conservatives love to rely upon and which overwhelmingly supports a techno-optimist disposition—the better guide?
In this thing called “AI policy,” I often worry that we dodge the hard-but-uncomfortable questions in favor of the controversial-but-boring ones.
Regardless, I suspect the ghosts will write the code they will need to integrate themselves into software and hardware systems of all kinds. They will monitor—and in some cases, no doubt, actuate—industrial machinery, either themselves or via subordinate machine learning-based control systems that the ghosts will help to engineer. They will have hooks into your phone, your car, and your home—not necessarily to “control” those things but to add more conscientiousness to your experience of them. Perhaps they will even help you raise your children.
—
A senior frontier lab employee asked me recently to name my favorite technological analogy for AI. I thought through all the familiar ones—electricity, internal combustion, the printing press. I like all of those for different reasons, but I concluded that the best answer was writing. It was when we learned to write words down that we gained the ability to crystallize knowledge. Knowledge could be shared with others and preserved for the future.
No complex intellectual endeavor of any kind would have been possible to sustain without writing. No base of collective knowledge could be built. The printing press made written knowledge cheaper to copy and spread, and this itself ripped apart many sacred institutions. But that knowledge, no matter how widespread, remained inert, still waiting for a human mind to animate it.
These ghosts are animations of mankind’s collective knowledge, instantiated as infinitely replicable computer programs that can reason, act, and build tools of their own. Our knowledge itself is becoming an actor on the world-historical stage. Perhaps this has been inevitable since writing first caught on. I chuckle sometimes and wonder whether Karpathy thought of the German translation of his metaphor: Geist. “Spirit” is how it’s often translated—as in Weltgeist. “World-spirit.”
How, and whether, you engage with these machines is your decision. But you would do well to pay attention. These ghosts get smarter almost every month.
There will be all sorts of ways to impede these ghosts. Some of them we should probably want—friction can be healthy—but most we will want to avoid.
In the fullness of time, these ghosts, working variously with some of us and against some of us, will overturn the present order of the ages. What comes in its place is our collective decision. Whatever you do, do not listen to the lullabies and do not sterilize this moment in history with cynicism or dullness. Measure twice, cut once. Know where you stand.

