A machine just closed nine mathematical problems that had been left open, in some cases, for decades. Google DeepMind's AlphaProof Nexus, an AI system that fuses large language models with the Lean formal proof assistant, autonomously solved 9 out of 353 open Erdős problems and proved 44 out of 492 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS). The cost per problem: a few hundred dollars.
The results, documented in an arXiv preprint published May 21, mark a meaningful escalation in what AI systems can do with mathematical research. Paul Erdős was one of the twentieth century's most prolific mathematicians, posing hundreds of unsolved conjectures across combinatorics, number theory, and graph theory. His problems have become a kind of de facto benchmark for certain corners of discrete mathematics. The website erdosproblems.com, maintained by mathematician Thomas Bloom since 2023, catalogs 1,179 of them. About 41 percent have been classified as solved.
How the System Works
AlphaProof Nexus addresses a persistent flaw in AI reasoning: hallucination. Language models can generate plausible-sounding proofs that are, mathematically, gibberish. Nexus sidesteps this by pairing generative capability with formal verification. The AI proposes a proof, then a separate system built on the Lean proof assistant checks every logical step. If the proof fails, it's rejected. If it passes, it is correct with respect to the formal statement Lean was given, though that guarantee says nothing about whether the formal statement faithfully captures the original problem.
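To make the idea of machine-checked steps concrete, here is a minimal Lean 4 illustration. It is not taken from the AlphaProof Nexus release: the theorem name `add_comm_example` is made up for this sketch, and `Nat.add_comm` is a lemma from Lean's core library.

```lean
-- Illustrative only: a statement and a proof that Lean's kernel checks.
-- `add_comm_example` is a hypothetical name chosen for this sketch.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

If the final line were replaced with a step that does not actually establish the goal, Lean would report an error instead of accepting the theorem. That reject-or-accept behavior is the property a propose-then-verify pipeline relies on.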
This dual approach runs in what DeepMind calls "agentic loops." The system iterates, refines, and resubmits proofs until Lean accepts them or the agent concludes the problem is beyond its current capability. Problems tackled include variants #125, #138, #741, and #12 from the Erdős catalog, with proofs shared publicly via GitHub and discussed on erdosproblems.com forums.
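The propose, check, and retry cycle described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not DeepMind's implementation: `agentic_loop`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names, the proposer stands in for a language model, and the checker stands in for Lean by testing a simple arithmetic property.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    accepted: bool
    feedback: str = ""  # what the checker reports back on rejection


def agentic_loop(
    propose: Callable[[str, str], str],
    check: Callable[[str], CheckResult],
    problem: str,
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate the checker accepts, or None if the budget runs out."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(problem, feedback)  # generative step
        result = check(candidate)               # verification step
        if result.accepted:
            return candidate
        feedback = result.feedback              # refine using the rejection
    return None                                 # give up: beyond current capability


# Toy stand-ins (assumptions for illustration): the "problem" is to find a
# nontrivial divisor of 91, and a "proof" is simply a candidate divisor.
def toy_propose(problem: str, feedback: str) -> str:
    # Start at 2, otherwise try the integer after the one just rejected.
    return str(int(feedback) + 1) if feedback else "2"


def toy_check(candidate: str) -> CheckResult:
    d = int(candidate)
    if 1 < d < 91 and 91 % d == 0:
        return CheckResult(True)
    return CheckResult(False, feedback=candidate)


solved = agentic_loop(toy_propose, toy_check, "factor 91", max_attempts=10)
gave_up = agentic_loop(toy_propose, toy_check, "factor 91", max_attempts=3)
```

With a budget of ten attempts the toy loop returns a candidate the checker accepts; with a budget of three it returns `None`, mirroring the case where the agent concludes a problem is out of reach. The key design point is that the checker, not the proposer, decides what counts as solved.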
From Olympiad to Open Research
AlphaProof Nexus is an evolution of the original AlphaProof, which made headlines in 2024 when it achieved silver-medal performance at the International Mathematical Olympiad. Working alongside DeepMind's AlphaGeometry 2, that system solved four out of six IMO problems, earning 28 out of 42 possible points. One of those was problem six, the competition's most difficult, solved that year by only five of 609 human contestants.
The jump from Olympiad to research-level mathematics is significant. Competition problems are designed to be solvable within hours by talented students. Open research problems carry no such guarantee. Many have persisted for decades precisely because no one has found an approach that works. A basic agent variant of Nexus also solved nine Erdős problems but at higher computational cost, suggesting the full architecture is more efficient, not necessarily more capable in raw terms.
Context and Caveats
There's recent history worth noting. In October 2025, OpenAI executives claimed GPT-5 had solved ten Erdős problems. Bloom clarified what actually happened: the model conducted literature searches and surfaced existing published solutions that he personally hadn't cataloged. The problems were "open" only in the sense that the database hadn't recorded the prior work. The solutions weren't novel. The resulting criticism from mathematicians, including Meta's Yann LeCun and DeepMind's Demis Hassabis, was pointed.
The AlphaProof Nexus results appear to be different. The formal verification pipeline means the proofs can be independently checked, and the GitHub release includes both formal Lean code and natural-language interpretations. Still, the mathematical community has learned to be cautious. Earlier this year, DeepMind's own Aletheia agent initially claimed nine novel autonomous solutions to Erdős problems, a number that shrank to four after external review revealed issues of "disentangling literature rather than issues of mathematical correctness." Future efforts, the researchers noted, will need to be cautious in this regard.
What It Means
Nine problems out of 353 open ones is roughly 2.5 percent, and 44 of 492 OEIS conjectures is about 9 percent. That's not a revolution. But the economics are striking. For a few hundred dollars per problem, an AI system can now attempt research mathematics at a level that, until recently, was exclusively human. The implications extend beyond pure math. Formal verification has applications in cryptography, smart contract auditing, and protocol design. DeepMind has built the system for mathematical research, with anticipated uses in combinatorics, algebraic geometry, and optimization. But the underlying technique, AI-driven formal verification, touches fields where correctness is non-negotiable.
Terence Tao, the Fields Medalist who has collaborated with DeepMind on related work, has noted that tools like these allow mathematicians to test inequalities for counterexamples and sharpen intuition. The system doesn't replace human insight. But it does shift the economics of mathematical exploration in ways that will take years to fully understand.


