When AI Starts Building AI: Anthropic Wants the World to Hit the Brakes. But Can Anyone Actually Stop?

Just days after becoming the most valuable AI startup in the world, Anthropic delivered a surprising message.

Slow down.

In a lengthy blog post titled “When AI Builds Itself,” Anthropic co-founder Jack Clark and Marina Favaro argued that artificial intelligence is rapidly becoming capable of contributing to its own development. If that trend continues, future AI systems may eventually design, test, and train the next generation of models with minimal human involvement.

Anthropic’s warning centers on a concept known as recursive self-improvement (RSI). The idea is simple but profound: once AI becomes sufficiently capable, it could accelerate its own progress, creating a feedback loop in which increasingly powerful systems build even more powerful successors.

The company isn’t claiming this has already happened. What it is saying is that the possibility may arrive sooner than governments, researchers, or society are prepared for.

The timing of the warning makes it particularly noteworthy.

Just days before publishing the article, Anthropic reportedly completed a new funding round that pushed its valuation to nearly $965 billion, surpassing OpenAI. The company is also preparing for a public offering and has seen annualized revenue grow from roughly $9 billion at the end of 2025 to nearly $47 billion today.

In other words, Anthropic is issuing a call for caution at the exact moment it has the strongest commercial incentive to accelerate.

That contradiction has become part of the story.

The Industry Is Already Moving Toward Automated AI Research

Anthropic’s concerns are not emerging in a vacuum.

Across the industry, leading AI labs are actively exploring ways for AI systems to participate in research and development.

OpenAI has publicly discussed building AI systems capable of functioning as research assistants and eventually autonomous researchers. Google DeepMind has experimented with systems like AlphaEvolve, where AI proposes algorithmic improvements, evaluates them through experiments, and iteratively refines its own discoveries.

A growing ecosystem of startups is pursuing the same vision. Companies such as Recursive Superintelligence and Mirendil are explicitly focused on building AI systems that can contribute to AI research itself.

The direction is becoming increasingly clear.

AI is no longer just a tool used by researchers. It is gradually becoming a participant in the research process.

Anthropic’s Internal Data Suggests the Shift Is Already Underway

To support its argument, Anthropic revealed several internal metrics that paint a striking picture of how quickly AI-assisted development is advancing.

The first involves software engineering.

According to the company, more than 80% of the code merged into Anthropic’s production systems as of May 2026 was written by Claude. Before the launch of Claude Code in early 2025, that figure was only in the single digits. At the same time, engineers are reportedly merging eight times more code per day than they were in 2024.

Anthropic acknowledges that code volume is not a perfect measure of productivity. More code does not automatically mean more value. Even so, the trend is difficult to ignore. Engineers increasingly spend their time reviewing, directing, and validating AI-generated work rather than writing everything themselves.

The second dataset focuses on research capability.

Anthropic tested its models on a challenge that involved optimizing AI training code. In 2025, Claude Opus 4 achieved approximately a threefold speedup. By 2026, Claude Mythos Preview achieved a roughly 52-fold speedup on the same benchmark.

While Anthropic cautions against directly translating these numbers into real-world productivity gains, the rate of improvement is remarkable.
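For a rough sense of scale, the arithmetic behind the two benchmark figures can be sketched as follows. This is a minimal illustration that uses only the approximate numbers quoted above; the function name `relative_gain` is hypothetical and is not part of anything Anthropic published.

```python
def relative_gain(earlier_speedup: float, later_speedup: float) -> float:
    """Return how many times larger the later speedup is than the earlier one."""
    if earlier_speedup <= 0:
        raise ValueError("earlier_speedup must be positive")
    return later_speedup / earlier_speedup


# Approximate figures as reported in the article (not independently verified).
opus_4_speedup = 3.0    # Claude Opus 4, 2025: roughly threefold
mythos_speedup = 52.0   # Claude Mythos Preview, 2026: roughly 52-fold

gain = relative_gain(opus_4_speedup, mythos_speedup)
print(f"{gain:.1f}x")  # 17.3x
```

On these figures, the newer model's result is about seventeen times the older one's on this single benchmark, which is a narrower claim than a seventeenfold gain in real-world research productivity.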

The third dataset may be the most dramatic.

In April 2026, Claude autonomously identified and fixed more than 800 API-related errors, reducing that category of bugs by a factor of roughly one thousand. Engineers estimated that doing the same work manually could have taken years.

The company also described experiments where multiple Claude agents collaborated on open AI safety research questions. In one case, the AI agents closed 97% of the gap between known performance bounds after roughly 800 cumulative hours of work. Human researchers working independently closed only 23% of the same gap during a week-long effort.

Most importantly, Anthropic claims AI is improving not only at execution but also at deciding what should be done next.

In retrospective evaluations, Claude Mythos Preview proposed research directions that reviewers judged superior to the paths chosen by human researchers 64% of the time.

That may be the most significant signal of all.

The question is no longer whether AI can perform tasks. Increasingly, it may be contributing to the decisions that shape future research itself.

Anthropic’s Real Concern Isn’t Technology. It’s Governance.

The article ultimately focuses less on capability and more on what happens if capability continues accelerating.

Anthropic outlines three potential futures.

In the first scenario, progress slows naturally while existing AI capabilities continue spreading across society.

In the second, AI dramatically increases research productivity, but humans remain responsible for setting goals and priorities.

The third scenario is the most controversial. Here, AI systems become capable of fully autonomous recursive self-improvement, creating and refining future generations of AI with little or no human oversight.

Anthropic believes the latter two possibilities could arrive faster than existing institutions can adapt.

As a result, the company proposes something that sounds increasingly radical in today’s competitive AI environment: coordinated slowing.

Specifically, Anthropic argues that any meaningful pause would require multiple leading AI companies and governments to reduce development simultaneously while also establishing mechanisms to verify that everyone is actually complying.

A unilateral slowdown would simply shift competitive advantage elsewhere.

The Biggest Question: Would Anyone Actually Stop?

This is where the debate becomes particularly contentious.

Anthropic insists that discussing these risks is consistent with its long-standing emphasis on AI safety. Critics, however, see a different story.

Some investors and policymakers argue that global verification frameworks could unintentionally strengthen the position of well-funded incumbents such as Anthropic, OpenAI, and Google while making life harder for open-source projects and smaller competitors.

The concern is straightforward.

The organizations most capable of satisfying complex regulatory and compliance requirements are often the same organizations already leading the race.

As a result, safety frameworks can sometimes look suspiciously similar to competitive moats.

Even supporters of AI governance acknowledge the dilemma.

As Ethan Mollick has noted, companies like Anthropic often contain multiple competing priorities at once. Some teams focus on commercialization. Others work on pushing model capabilities forward. Still others concentrate on long-term safety concerns.

These goals are not always perfectly aligned.

That tension helps explain why Anthropic often appears to be accelerating and warning about acceleration at the same time.

The Hardest Problem May Not Be Building RSI. It May Be Preventing It.

Anthropic’s proposal ultimately depends on a powerful assumption: that competitors can collectively agree to slow down.

History offers reasons for skepticism.

Building international verification systems for nuclear arms control took decades of negotiation, trust-building, and work on enforcement mechanisms. AI is spreading globally at a pace that far exceeds most historical technology transitions, leaving far less time to build anything comparable.

Even if every major player agreed that recursive self-improvement carries significant risks, the incentives to keep moving remain enormous. Whoever continues advancing while others hesitate could inherit a decisive advantage. That reality leaves the industry facing an uncomfortable paradox.

A growing number of researchers believe AI progress will accelerate over the coming years. Yet the more credible that prediction becomes, the harder it is to convince anyone to voluntarily step off the accelerator.

Anthropic’s article asks whether humanity should be preparing emergency brakes. The harder question is whether those brakes are connected to anything at all.