Queensland: Google and OpenAI have each reached a significant milestone in artificial intelligence, with their models achieving gold-medal scores at the International Mathematical Olympiad (IMO). This marks the first time AI systems have reached the gold-medal standard traditionally earned by the world’s top high-school mathematicians.
Google DeepMind worked with the IMO to have its general-purpose Gemini Deep Think model graded by the competition’s committee.
The model solved five of the six problems using natural language reasoning within the official 4.5-hour limit, unlike older methods that required formal languages and extended computation. OpenAI reported a similar result with its experimental reasoning model, whose solutions were evaluated by external IMO medalists.

Researchers believe the achievement shows AI could soon help mathematicians tackle open problems. Junehyuk Jung, a Brown University professor working with Google DeepMind, noted that the ability to solve complex reasoning tasks in natural language opens the door to deeper collaboration between humans and AI.
OpenAI’s result relied on scaling up ‘test-time compute’, which gives the model more time and computational power to pursue multiple lines of thought at once. Researcher Noam Brown described the process as very costly, although exact figures were not shared.
Of the 630 students who competed in this year’s IMO on Australia’s Sunshine Coast, 67 (roughly 11%) received gold medals. The AI systems’ scores place them in that same gold-medal tier alongside the top human contestants.

Google DeepMind CEO Demis Hassabis stated that the company respected the IMO Board’s request to publish results only after the student winners had been announced.
OpenAI’s researchers noted that their advanced reasoning system will not be released immediately, but said it demonstrates clear progress towards strong performance in fields beyond mathematics, such as physics and other complex scientific research.
The milestone illustrates how AI systems continue to push the boundaries of what machines can do, narrowing the gap between artificial and human intelligence.

