What are the key points?

Anthropic warns of recursive self-improvement risks in a new 10,000-word technical paper Claude model now authors over 80% of Anthropic's codebase as of May 2026 Company advocates for a coordinated, verifiable pause on frontier AI development efforts

Anthropic Warns of Risks From Recursive AI Self-Improvement

•Anthropic warns of recursive self-improvement risks in a new 10,000-word technical paper
•Claude model now authors over 80% of Anthropic's codebase as of May 2026
•Company advocates for a coordinated, verifiable pause on frontier AI development efforts

Anthropic has released a new 10,000-word paper titled "When AI builds itself," shifting its public messaging from the economic impact of job displacement to the risks of autonomous AI systems. While CEO Dario Amodei has frequently warned that AI could lead to 10% to 20% unemployment within five years, particularly in coding, finance, and law, the new analysis focuses on "recursive self-improvement." This refers to an AI system becoming capable of designing and training its own successor with minimal human intervention. While Anthropic acknowledges this capability has not yet been achieved, it argues the technical gap is closing rapidly.

Internal data shared in the paper highlights the accelerated pace of AI-assisted development. As of May 2026, over 80% of code merged into Anthropic's codebase was authored by its Claude model, a significant increase from the low single digits recorded before the February 2025 launch of Claude Code. Anthropic engineers are now shipping 8x more code per quarter than they did between 2021 and 2025. Additionally, in a March 2026 internal poll, researchers reported that the internal Mythos Preview model increased their output by approximately 4x compared to working without AI. One specific case from April 2026 demonstrated Claude shipping over 800 fixes that reduced a class of API errors by a factor of 1,000, a task estimated to require four years of human labor.

The paper warns that as AI systems gain the ability to build successors, the challenge of alignment shifts from a research problem to a survival risk, where human control could be permanently lost. The external evaluator METR reported that the Mythos Preview model can now operate independently for at least 16 hours. Furthermore, the length of tasks the model can complete reliably has been doubling every four months. In response to these findings, Anthropic is advocating for a coordinated, verifiable pause on frontier AI development. The company plans to convene policymakers and other AI laboratories through the Anthropic Institute to build the necessary verification systems, noting that while such an agreement is difficult to enforce, the urgency of managing these powerful systems is increasing.

Anthropic has released a new 10,000-word paper titled "When AI builds itself," shifting its public messaging from the economic impact of job displacement to the risks of autonomous AI systems. While CEO Dario Amodei has frequently warned that AI could lead to 10% to 20% unemployment within five years, particularly in coding, finance, and law, the new analysis focuses on "recursive self-improvement." This refers to an AI system becoming capable of designing and training its own successor with minimal human intervention. While Anthropic acknowledges this capability has not yet been achieved, it argues the technical gap is closing rapidly.

Internal data shared in the paper highlights the accelerated pace of AI-assisted development. As of May 2026, over 80% of code merged into Anthropic's codebase was authored by its Claude model, a significant increase from the low single digits recorded before the February 2025 launch of Claude Code. Anthropic engineers are now shipping 8x more code per quarter than they did between 2021 and 2025. Additionally, in a March 2026 internal poll, researchers reported that the internal Mythos Preview model increased their output by approximately 4x compared to working without AI. One specific case from April 2026 demonstrated Claude shipping over 800 fixes that reduced a class of API errors by a factor of 1,000, a task estimated to require four years of human labor.

The paper warns that as AI systems gain the ability to build successors, the challenge of alignment shifts from a research problem to a survival risk, where human control could be permanently lost. The external evaluator METR reported that the Mythos Preview model can now operate independently for at least 16 hours. Furthermore, the length of tasks the model can complete reliably has been doubling every four months. In response to these findings, Anthropic is advocating for a coordinated, verifiable pause on frontier AI development. The company plans to convene policymakers and other AI laboratories through the Anthropic Institute to build the necessary verification systems, noting that while such an agreement is difficult to enforce, the urgency of managing these powerful systems is increasing.