Achieving Expert-Level Agent from Foundation Model via Complexity Curriculum Reinforcement Learning with Synthetic Data
Keywords: Large Language Model, Reinforcement Learning, Geometry Agent Prover
TL;DR: We propose Complexity Curriculum Reinforcement Learning to train LLMs to solve IMO-level geometry with minimal data, surpassing gold medalists and showing emergent creativity.
Abstract: Large Language Model (LLM)-based agents exhibit strong mathematical problem-solving ability and can even solve International Mathematical Olympiad (IMO)-level problems with the assistance of a formal-language prover. However, hindered by weak heuristics for auxiliary constructions, AI for solving geometry problems remains limited to specialist models such as AlphaGeometry2, which relies heavily on large-scale data synthesis and search for both training and testing.
This paper therefore makes the first attempt to investigate how to build a medalist-level LLM agent for solving geometry problems, and proposes InternGeometry. InternGeometry overcomes the weak heuristics of geometry problems by continuously proposing propositions and auxiliary configurations, verifying them in a symbolic engine, and reflecting on the engine's feedback to form the next proposal; a dynamic memory mechanism allows InternGeometry to conduct more than two hundred model-symbolic-engine interactions per problem. To further accelerate learning, we introduce Complexity-Boosted Reinforcement Learning (CBRL), which gradually scales the complexity of synthesized problems across training stages.
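To make the interaction loop concrete, the sketch below illustrates one plausible reading of the propose-verify-reflect cycle and the complexity curriculum described above. It is a minimal sketch under stated assumptions, not InternGeometry's implementation: the `llm`, `engine`, and `synthesizer` interfaces, the memory-compaction rule, and the complexity measure are all hypothetical stand-ins for components the abstract only names.

```python
# Hypothetical sketch of the propose-verify-reflect loop described in the
# abstract. The `llm` and `engine` interfaces and the memory rule are
# illustrative assumptions, not InternGeometry's actual API.

def solve(problem, llm, engine, max_interactions=200):
    memory = []  # dynamic memory: compact reflections on past engine feedback
    for _ in range(max_interactions):
        # 1. Propose a proposition or auxiliary configuration, conditioned
        #    on the problem statement and the current memory.
        proposal = llm.propose(problem, memory)
        # 2. Verify the proposal deductively in the symbolic engine.
        feedback = engine.verify(problem, proposal)
        if feedback.proves_goal:
            return feedback.proof
        # 3. Reflect on the feedback; storing only a short summary keeps the
        #    context small enough for hundreds of interactions.
        memory.append(llm.reflect(proposal, feedback))
    return None  # interaction budget exhausted without a proof


def sample_curriculum_problem(stage, synthesizer, base=2, step=1):
    # Complexity-boosted training: later RL stages draw synthetic problems
    # of higher complexity. The complexity measure (e.g. number of auxiliary
    # points or proof length) is an assumption; the abstract does not
    # specify it.
    return synthesizer.sample(complexity=base + step * stage)
```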
Based on InternThinker-32B, InternGeometry solves 44 of 50 IMO geometry problems (2000–2024), exceeding the average gold-medalist score (40.9), with only 13K training examples, 0.004% of the data used by AlphaGeometry2, demonstrating the potential of LLM agents on expert-level tasks. InternGeometry is also capable of proposing novel auxiliary constructions on IMO problems that do not appear in human solutions.
The model, data, and symbolic engine will be released to benefit future research.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 22487