AI math genius delivers 100% accurate results


AlphaProof learning and adaptation process. Credit: Nature (2025). DOI: 10.1038/s41586-025-09833-y
At the 2024 International Mathematical Olympiad (IMO), one competitor performed well enough to earn a silver medal, with one exception: it was an AI system. This was the first time an AI achieved a medal-level performance in the history of the competition. In an article published in the journal Nature, researchers detail the technology behind this remarkable achievement.
The AI is AlphaProof, a sophisticated program developed by Google DeepMind that learns to solve complex math problems. The IMO result was impressive in itself, but what makes AlphaProof truly special is its ability to find and fix errors. Although large language models (LLMs) can solve mathematical problems, they often cannot guarantee the correctness of their solutions; hidden flaws may lurk in their reasoning.
AlphaProof is different because its answers are formally verified and therefore guaranteed correct. It works inside a specialized software environment called Lean (originally developed at Microsoft Research), which acts as a strict teacher checking every logical step. Because the computer mechanically verifies each proof, its conclusions are reliable.
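To make this concrete, here is a minimal illustration of what Lean's checking looks like (this toy theorem is my own example, not one from AlphaProof's training data). Lean will only accept the theorem once every step type-checks; a single unjustified step makes the whole proof fail to compile.

```lean
-- Lean 4: a trivial theorem and its proof.
-- The checker verifies that Nat.add_comm really does produce
-- evidence of a + b = b + a; no step can be hand-waved.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If the proof term were wrong (say, `Nat.add_comm b a` with mismatched goal), Lean would reject it outright rather than let a flawed argument through.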
Three-step training process
Training this powerful system to reason at an elite level involved three distinct stages. First, researchers exposed AlphaProof to approximately 300 billion tokens of general code and mathematical text to give it a broad understanding of concepts such as logic, mathematical language, and programming structure. It was then fine-tuned on 300,000 math problems written by experts and already formalized in the Lean environment.
In the final stage, the system learned to solve problems on its own. It was given a massive assignment of 80 million formal math problems. Using reinforcement learning (RL), which is based on trial and error, AlphaProof was rewarded for each successful proof. By tackling mathematical problems at such a scale, the system taught itself new and complex reasoning strategies that went beyond copying human examples.
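The trial-and-error loop described above can be sketched in miniature. The code below is a hedged illustration, not AlphaProof's actual algorithm: a toy "policy" samples proof tactics, a stand-in checker plays the role of Lean, and only verified successes earn a reward, so the rewarded tactic gradually dominates.

```python
import random

# Illustrative RL loop: reward is granted ONLY when the (stand-in)
# checker accepts the proof, mirroring how Lean gates AlphaProof's reward.
TACTICS = ["rfl", "simp", "ring", "omega"]

def checker_accepts(problem, tactic):
    """Stand-in for Lean: accepts only the tactic this toy problem needs."""
    return tactic == problem["solution"]

def train(problems, episodes=2000, seed=0):
    rng = random.Random(seed)
    # Policy: a preference score per tactic, bumped on each reward.
    scores = {t: 1.0 for t in TACTICS}
    for _ in range(episodes):
        problem = rng.choice(problems)
        # Sample a tactic proportionally to its current score.
        r, acc = rng.uniform(0, sum(scores.values())), 0.0
        for tactic, s in scores.items():
            acc += s
            if r <= acc:
                break
        if checker_accepts(problem, tactic):
            scores[tactic] += 1.0  # reward only verified proofs
    return scores

# Toy problems that all happen to need the "ring" tactic.
problems = [{"id": i, "solution": "ring"} for i in range(10)]
learned = train(problems)
best = max(learned, key=learned.get)
```

The key design point mirrored here is that the reward signal comes from the verifier, not from the model's own confidence, so the policy can never be rewarded for a plausible-looking but wrong proof.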
For the most difficult problems, AlphaProof used a technique developed by the researchers called Test-Time RL (TTRL), which creates and solves millions of simplified versions of the target problem until it finds a solution.
“Our work demonstrates that large-scale learning from concrete experience produces agents with complex mathematical reasoning strategies, paving the way for a reliable AI tool for solving complex mathematical problems,” the researchers wrote in their paper.
In addition to tackling seemingly unsolvable mathematical problems, AlphaProof could also be used by mathematicians to check their work and help them develop new theories.
Written for you by our author Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan.
More information:
Thomas Hubert et al., Olympiad-level formal mathematical reasoning with reinforcement learning, Nature (2025). DOI: 10.1038/s41586-025-09833-y
© 2025 Science X Network
Quote: AI Math Genius Delivers 100% Accurate Results (November 14, 2025) Retrieved November 14, 2025 from https://phys.org/news/2025-11-ai-math-genius-accurate-results.html
This document is subject to copyright. Except for fair use for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.

