20 Comments
Michelle Ma

I agree that regulation's path-dependence mandates caution & adaptive rules. But doesn't conservatism also lock us into a potentially costly path--namely, inaction during a critical window?

Sure, waiting for AI's trajectory to unfold gives us greater certainty about prudent policy actions. But in the world where your predictions materialize, you also talk about how AI policy becomes coarser, more populist, & more entangled with corporate interests. Isn't there a tradeoff between greater certainty about which policies are prudent & diminishing certainty about whether these policies can pass at all? Can good foundations alone counter the explosion of competing political interests as AI's impacts intensify?

I wrote a bit about this tradeoff here: https://bullishlemon.substack.com/p/ai-forecastings-catch-22-the-practical

Dean W. Ball

I do think this is a tradeoff, but consider what hasty policy advocacy has done for the cause of AI safety:

1. It has made a large share of industry figures and others believe that AI safety was opposed to open source, which polarized things early. Many AI safety advocates really did oppose open source, and this was a very heavy price to pay.

2. It caused many people to view AI safety advocates as extremists.

3. The community's emphasis on extremely short timelines created predictions that are very easy to falsify and rebut. "AI isn't going to imminently go into a recursive loop to superintelligence" is a pretty easy argument to make.

All of this and more has resulted in AI safety being discredited, and probably caused many policymakers (who will naturally gravitate toward a synthesis of different views) to dismiss near-term transformative AI. Even today, many people in DC polite society feel compelled to begin their statements of position with "I don't believe in the crazy AGI stuff, BUT…" followed by some lukewarm policy takes.

So I agree with you there’s a tradeoff here, and neither side of the dichotomy you’ve pointed out is “right.” I suspect, however, that incrementalism is our only real option given political constraints.

Keller Scholl

Agreed with all conclusions.

"Historians looking back on the period between 2025 and 2035 are likely to describe it as a renaissance. And it will be. That does not necessarily mean it will be an enjoyable experience for most of the people who live through it." People argue about the exact dates of the Italian Renaissance, but I think "not an enjoyable experience" describes enough of it.

Freddie deBoer

What is the specific, scientifically and technologically plausible process through which next-token predictors become superintelligent? What is the specific, scientifically and technologically plausible process through which next-token predictors become recursively self-improving? How do systems that decide on next individual tokens (and do not even know what the next token in a string will be prior to generating it) based on statistical patterns derived from large learning sets become "AGI"? Why are we to suppose that systems that simply identify text strings that are the most algorithmically likely to be perceived to satisfy a user's prompt have any ability whatsoever to meaningfully improve? Why is the actual process through which "the Singularity" will happen no more detailed, robust, or evidence-based in our current analysis than it is in science fiction? Why is all of this so incredibly vague on what is the only meaningful question, HOW a Chinese room that generates tokens it doesn't understand based on input strings it doesn't understand supposedly becomes superintelligent? And why do people think the burden of proof is on skeptics when no one can answer any of these questions with any degree of specificity, technological certainty, or empirical evidence?

Dean W. Ball

The way it happens is that you take the next-token predictor and train it in reinforcement learning environments to maximize expected reward on diverse sets of problems, thereby honing skills in novel domains, especially domains where verification is easy. This is what has been underway for a year or more, and it has yielded models that have discovered new mathematics and outperformed humans in coding competitions. Indeed, it is debatable whether it is even still appropriate to call the current reasoning models "next-token predictors," though of course you also underrate, likely because you do not know or understand it, how sophisticated SGD can be.
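(To make the shape of that loop concrete, here is a minimal sketch of reinforcement learning against a verifiable reward. It uses a deliberately tiny toy policy in place of a real language model, a trivial arithmetic task in place of real problems, and plain REINFORCE rather than the more elaborate policy-gradient methods the labs actually use; every name and number in it is an illustrative assumption, not something from the essay.)

import torch
import torch.nn as nn

# Toy task: emit SEQ_LEN digit-tokens that sum to TARGET; trivially verifiable.
VOCAB, SEQ_LEN, BOS, TARGET = 10, 4, 10, 18

class TinyPolicy(nn.Module):
    # A very small autoregressive next-token predictor, standing in for an LLM.
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB + 1, hidden)  # +1 for the BOS token
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)[:, -1]  # logits for the next token only

def rollout(policy, batch=256):
    # Sample completions autoregressively, keeping per-step log-probabilities.
    tokens = torch.full((batch, 1), BOS)
    logps = []
    for _ in range(SEQ_LEN):
        dist = torch.distributions.Categorical(logits=policy(tokens))
        tok = dist.sample()
        logps.append(dist.log_prob(tok))
        tokens = torch.cat([tokens, tok.unsqueeze(1)], dim=1)
    return tokens[:, 1:], torch.stack(logps, dim=1)

def verifier(seqs):
    # The "easy verification" step: reward 1.0 iff the digits sum to TARGET.
    return (seqs.sum(dim=1) == TARGET).float()

policy = TinyPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for step in range(500):
    seqs, logps = rollout(policy)
    reward = verifier(seqs)
    advantage = reward - reward.mean()               # crude baseline
    loss = -(advantage.unsqueeze(1) * logps).mean()  # REINFORCE: push up verified successes
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 100 == 0:
        print(f"step {step}: verified-success rate = {reward.mean().item():.2f}")

(The real pipelines swap in a pretrained LLM for TinyPolicy and use unit tests, proof checkers, or graders as the verifier, but the logic of rewarding verified successes is the same in spirit.)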

The next stages you’d need for AGI, as I see it at least, are identified in the first section of this essay.

I didn’t say anything about recursive self-improvement, nor, in fact, about superintelligence, though it is an empirical fact that AI is already automating parts of AI research and engineering.

“Understand” is a word I’ll let the pedants debate. To some it is understanding, to others it is not, to me it is irrelevant compared to the observed capabilities of the systems.

In general, I think one of those next-token predictors would understand my essay better than you did, understand AI better than you do, and would have written a better critical comment than you did. I think you should take a nap, and then reflect on these facts, and ponder whether you are ready for the next decade. I think you're in for some surprises. Have a nice day!

Freddie deBoer

1. This is a "then, a chemistry occurs"-length analysis that's backstopped by bluster and personal insult.

2. The presence of the personal insult demonstrates that you are on some level aware that there is no scientifically rigorous justification for the notion that any currently developing AI system has anything like the capacity to reach "runaway growth" or whatever the current cliché is. I'm sorry, but it's science fiction.

3. Why on Earth would I trust that systems that are so terribly riddled with hallucinations and errors would be able to self-improve in a way that did not simply replicate and magnify the error-making tendency?

I'm afraid no machine god is coming to rescue you from mundanity. We're still stuck here.

Dean W. Ball

You seem confused. I provided you with a fairly detailed technical explanation that you just didn't respond to at all. And then you keep talking about "runaway growth," which I don't believe my essay is about. At what point do I talk about recursive self-improvement, runaway growth, machine gods, etc.?

It seems like you didn’t really read my piece or my reply to you, or that if you did, you understood neither. You seem full of rage and words, and I have the impression everything is not alright with you. I’m going to stop replying now. Happy to delete this entire exchange if, upon saner reflection, you deem it bad for your reputation, since I understand you are apparently a well-known writer. Just let me know, and be well.

Michelle Ma

Some theory/evidence on a specific & technologically plausible process of recursive self-improvement:

- AI has been rapidly improving at the capabilities necessary to contribute to AI R&D, such as coding (https://epoch.ai/benchmarks), making novel contributions to math research (https://x.com/mathematics_inc/status/1966194751847461309), & solving long-horizon software engineering problems (https://arxiv.org/pdf/2503.14499#cite.METR_HCAST).

- Assuming we're on trend to get an AI R&D-automating agent (based on the above observations), this paper outlines & evaluates the plausibility of (one model of) an intelligence explosion (pp. 28-31 in particular look at the empirical returns to software R&D): https://www.forethought.org/research/will-ai-r-and-d-automation-cause-a-software-intelligence-explosion.pdf

- A following paper evaluates how big & quick this software intelligence explosion could be (using an economic model of R&D) - their conclusions are comparatively moderate, but still entail superhuman systems relatively quickly: https://www.forethought.org/research/how-quick-and-big-would-a-software-intelligence-explosion-be.pdf

Of course it's possible to dispute/critique this work, but it *is* out there.
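(For anyone who doesn't want to open the PDFs, here is a toy sketch of the feedback dynamic at issue; it is my own drastic simplification, not the papers' actual model. Treat software capability S as the thing being improved, and suppose that once AI supplies the research labor, the rate of improvement scales like S**r for some "returns" parameter r. Then r <= 1 gives at most steady exponential growth, while r > 1 produces growth that runs away in finite time.)

def simulate(r, dt=0.01, t_max=20.0, blowup=1e9):
    # Euler-integrate dS/dt = S**r from S(0) = 1 and report whether capability "explodes".
    s, t = 1.0, 0.0
    while t < t_max:
        s += dt * s**r
        t += dt
        if s > blowup:
            return f"r={r}: exploded at t ~ {t:.1f}"
    return f"r={r}: S = {s:.3g} at t = {t_max:g} (no explosion)"

for r in (0.8, 1.0, 1.2):
    print(simulate(r))

(Whether compute and other bottlenecks cap the effective research input before that loop closes is, roughly, the question the second paper, and the replies below about compute constraints, are trying to weigh.)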

Dean W. Ball

I suspect RSI is in some sense already happening but won't ever happen in the imagined way because of compute constraints, though that's only a lightly held view.

Dean W. Ball

Either way, it annoys me how people with withering worldviews and nothing to say other than hate and cynicism, like Freddie, rely on RSI and extremely short timelines as their strawman. So I try not to rely on such things, precisely to avoid room-temperature-IQ critiques like the original comment.

Michelle Ma

Fair, though I do somewhat understand Freddie's frustration, as AI safety discourse *can* tend to advance glib assumptions and confused analogies to support RSI, short timelines, & god-like superintelligence (and dismissiveness towards criticism--something frustrating I've encountered a lot is "ok but if it's smart enough that won't matter" in response to any AI progress bottleneck).

But I don't think you did this in your post & I respect not relying on RSI. I do think RSI arguments generally should be publicly supported by more rigorous analyses (I suspect most AI skeptics haven't seen the Forethought papers).

Michelle Ma

Mind elaborating? The linked papers assume fixed compute (though they do rely on some assumptions about R&D experiment size).

Freddie deBoer

I have it on good authority that there are also several pre-existing books and texts that detail why we will not die but shall live forever in Paradise with our long lost loved ones for an eternity in glorified bodies, and those texts and the practices they've spawned exist for the exact same psychological reasons as your need to believe that some superintelligent force will soon emerge to save you from everything you don't like about your life.

Michelle Ma

I do not believe that & have my own reservations about superintelligence (and certainly short timelines), but let's not use psychologizing to argue that the Bible & the papers I linked are on the same epistemic level. My own AI governance research involved reading long regulatory texts & public comment records and making line-by-line critiques of state-level legislation, so let's not assume that I'm in need of saving from immediate mundanity either. In fact, my intuition is rather aligned with Amara's Law: "We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run."

I generally find that psychologizing is not a productive discursive method unless done with understanding, tact, & an existing rapport.

Dean W. Ball

To reply re: compute constraints, my assumption is that it would not be possible to roll out even a modest number of agents for medium- to long-horizon experiments (which themselves also require compute) without fairly serious additional marginal compute.

I do think AI-assisted automation is already speeding up the labs (in some sense it has been for years), and that this boost will get larger over time. It will probably be noticeable to an outsider at some point, perhaps soon.

But I expect a) human bottlenecks (the need for oversight, approval of experimental compute use, etc.), b) the compute intensity, and c) that these initial agents will not be "automated geniuses," but instead "highly competent automated mundane research executors." The result of these three things will be that the main effect is to accelerate our way up the many S-curves that make up AI progress, and indeed a great deal of technological progress.

Mike F

This is very thoughtful and interesting. However, your back-of-the-hand treatment of creators is frustrating and misguided. Creators are NOT in the way of AI, but they do have a desire to control their work. The tech companies need these suppliers, but they are simply taking the content without respecting their rights. Just as tech companies can pay their employees, their compute costs, and their power bills, they can pay content creators. They just have a problem to solve: how to do it, how to allocate the money, and how to respect the permissions. You make it sound like creativity, journalism, commentary, and innovation are now obsolete and we should just let the tech companies decide. But self-governance requires independent sources of information; it requires humans (or humans guiding AI) to be there where the news is happening, to interrogate those in power, to report what's online and what's not. That content is what people come to search for and what they ask their AIs to find. It is actually in HIGH DEMAND, not obsolete. The old deal - give us content to search and we will provide traffic - is disappearing, and, like a clear-cut forest, it will be very difficult to bring back. The sooner the big tech companies recognize this, the sooner the problem can be solved. There is sufficient technology and money to make it work without destroying the ecosystem and ignoring human rights.

Steven Postrel

What worries me about the CA legislation you now support (to my surprise) is that if one simply substitutes "software" for "frontier model" in it, there is almost no difference in purported rationale and purported safeguards and remedies. This bill appears in its very first passages to outlaw all proprietary (non-disclosed) software that uses statistical inference methods (or possibly all proprietary software in general). And it places huge disclosure burdens on all software developers while loading impractical duties upon the "Office of Emergency Services."

I haven't even "delved" into the later parts of the bill, but it gives every appearance of being the kind of CA regulatory overreach for which the state is infamous.

Dean W. Ball

I would recommend that you read the entire bill, and reflect on the fact that no, most software does not in fact intrinsically pose novel biorisk, while frontier AI now does.

Steven Postrel

I am still not clear how the below is not a Pandora's Box of classic CA innovation-stomping, full of vague lawyer-speak about "foreseeable" and "expert-level" and "substantially similar form" and "in combination with other software if the frontier model did not materially contribute to the harm." It is a trial-lawyer's and ambulance-chasing AG's and anti-tech NGO's dream text.

And while it would likely slow down progress overall, it seems tailor-made to help out existing leaders in the field (e.g., Anthropic, which very publicly addresses (c)(1)(A) with each version of Claude) while presenting insuperable barriers to small players (or individual users) who would lack the resources to check for this on each release.

I will address the purported restrictions of the domain of the law in another comment.

"(c) (1) “Catastrophic risk” means a foreseeable and material risk that a frontier developer’s development, storage, use, or deployment of a frontier model will materially contribute to the death of, or serious injury to, more than 50 people or more than one billion dollars ($1,000,000,000) in damage to, or loss of, property arising from a single incident involving a frontier model doing any of the following:

(A) Providing expert-level assistance in the creation or release of a chemical, biological, radiological, or nuclear weapon.

(B) Engaging in conduct with no meaningful human oversight, intervention, or supervision that is either a cyberattack or, if the conduct had been committed by a human, would constitute the crime of murder, assault, extortion, or theft, including theft by false pretense.

(C) Evading the control of its frontier developer or user.

(2) “Catastrophic risk” does not include a foreseeable and material risk from any of the following:

(A) Information that a frontier model outputs if the information is otherwise publicly accessible in a substantially similar form from a source other than a foundation model.

(B) Lawful activity of the federal government.

(C) Harm caused by a frontier model in combination with other software if the frontier model did not materially contribute to the harm."

Steven Postrel

Below is the set of terms used to define what the bill purports to be concerned about. I again ask in all sincerity how this could be acceptable as a general regulatory framework in a world where people are allowed to deploy software without proving that it will never fail and hurt somebody.

1. Let's start with a "foundation model," defined below, which turns out to cover almost anything:

"Trained on a broad data set" means what? If I build an econometric macro-forecasting multiple regression model drawing on diverse data sets (government accounts, meteorology, sentiment analysis, etc) then that would seem to be covered. And "trained" is a term with a highly elastic meaning, so that even a geometry tutor could be argued to have been "trained" to recognize a "broad data set" of shapes and coordinate systems.

"Designed for generality of output" includes all general-purpose productivity tools, all user-generated content spaces, and arguably all computer languages. It excludes virtually nothing.

"Adaptable to a wide range of distinctive tasks" gives no unit of breadth to distinguish broadness or distinctiveness. A model that focuses on HR could be said to accomplish a broad range of distinctive tasks, including writing job postings, evaluating resumes, conducting interviews, etc.

"(f) “Foundation model” means an artificial intelligence model that is all of the following:

(1) Trained on a broad data set.

(2) Designed for generality of output.

(3) Adaptable to a wide range of distinctive tasks."

2. "Frontier model" become the last line of defense, but it is porous and dangerously expansive:

The bill defines such a thing as "a foundation model [which is not a serious limitation of scope, see 1 above] that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations."

It doesn't even explain what "trained" means. It doesn't explain what happens when a model is distilled from another. It sets a quantitative level that will likely be surpassed very quickly by average developers given the current trends in chip deployment, while conversely missing (perhaps providentially) technological progress that runs outside the current scaling paradigm for AI progress.

3. That leaves us with the main line of defense, section (j), which specifies a $500 million gross revenue floor for the developer (and its "affiliates") to qualify as "large." This isn't very encouraging either. Stanford University, for example, has had revenue greater than this number in most years, so any external "affiliate" working with them in any way on an AI model would be covered. Worse, any firm with $500 million in gross revenues from traditional activities that tries to add an LLM or diffusion model to its service stack would seem to be covered, even if it garnered zero separate revenue from the AI tool it "deployed" to its customers.

(g) “Frontier AI framework” means documented technical and organizational protocols to manage, assess, and mitigate catastrophic risks.

(h) “Frontier developer” means a person who has trained, or initiated the training of, a frontier model, with respect to which the person has used, or intends to use, at least as much computing power to train the frontier model as would meet the technical specifications found in subdivision (i).

(i) (1) “Frontier model” means a foundation model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations.

(2) The quantity of computing power described in paragraph (1) shall include computing for the original training run and for any subsequent fine-tuning, reinforcement learning, or other material modifications the developer applies to a preceding foundation model.

(j) “Large frontier developer” means a frontier developer that together with its affiliates collectively had annual gross revenues in excess of five hundred million dollars ($500,000,000) in the preceding calendar year.

(k) “Model weight” means a numerical parameter in a frontier model that is adjusted through training and that helps determine how inputs are transformed into outputs.

(l) “Property” means tangible or intangible property.

4. Beyond the excessive reach of this law, even for the targets it was ostensibly aimed at--OpenAI, Alphabet, Meta, Anthropic, Microsoft, Amazon--it seems pretty obvious that it creates a happy hunting ground for plaintiffs' attorneys to trawl for disgruntled and possibly delusional employees who want to be well-paid heroes and martyrs for AI safety.

And then there are the anti-competitive effects: Why the requirements to disclose proprietary information to the public and to imitators? I guess one could argue that given the law's chilling effect on would-be imitators this is rough justice or something.

5. Finally, this law would seem to put the open-source AI corporate alliance (Meta, IBM, etc.) on notice that its members could be held liable for modifications and deployments of their given-away software by unknown third parties. How this could possibly be a good idea is not clear to me.

So, if I may be permitted to sound like an AI after a lazy prompt, after delving into the bill, it isn't just another piece of legislation--it's a loaded gun aimed at the heart of software innovation in the U.S.
