Wu’s Method Boosts Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry

Published: 10 Oct 2024, Last Modified: 31 Oct 2024, MATH-AI 24, CC BY 4.0
Keywords: automated theorem proving, geometry, IMO, symbolic
TL;DR: Combining Wu's method with synthetic techniques solves 21/30 IMO geometry problems, yielding a symbolic baseline rivaling silver medalists. Ensembling with AlphaGeometry yields the first AI method to solve 27 problems and outperform gold medalists.
Abstract: Proving geometric theorems constitutes a hallmark of reasoning combining intuitive, visual, and logical skills, making automated theorem proving of Olympiad-level geometry problems a milestone for human-level automated reasoning. AlphaGeometry, a neuro-symbolic model trained with 100M synthetic samples, solved 25 of 30 International Mathematical Olympiad (IMO) problems. It marked a major breakthrough compared to the reported baseline using Wu's method, which solved only 10. Revisiting the IMO-AG-30 benchmark, we find that Wu's method is surprisingly strong and solves 15 problems, including some unsolved by other methods. This leads to two key findings: (i) Combining Wu's method with the classic synthetic methods of deductive databases and angle, ratio & distance chasing solves 21 out of 30 problems on a CPU-only laptop limited to 5 minutes per problem. Essentially, this classic method solves just 4 fewer problems than AlphaGeometry and establishes the first *fully symbolic* baseline that rivals the performance of IMO silver medalists. (ii) Wu's method even solves 2 of the 5 problems that AlphaGeometry failed on. Combining both, we set a new state of the art for automated theorem proving on IMO-AG-30, solving 27 out of 30 problems: the first AI method to outperform an IMO gold medalist.
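For readers unfamiliar with the algebraic core of Wu's method, the sketch below illustrates the basic idea on a toy theorem (the circumcenter of a triangle is equidistant from all three vertices). The coordinatization, variable names, and the use of sympy are illustrative assumptions, not the paper's implementation: hypotheses and conclusion are translated into polynomial equations, the hypotheses are put into triangular form, and the conclusion is reduced by successive pseudo-division.

```python
# Minimal sketch of the algebraic core of Wu's method (not the paper's code).
import sympy as sp

# Free parameters: triangle A=(0,0), B=(u1,0), C=(u2,u3).
# Dependent variables: circumcenter O=(x1,x2).
u1, u2, u3, x1, x2 = sp.symbols('u1 u2 u3 x1 x2')

# Hypotheses in triangular form (a characteristic set):
# O lies on the perpendicular bisectors of AB and AC,
# i.e. |OA|^2 = |OB|^2 and |OA|^2 = |OC|^2.
h1 = sp.expand((x1**2 + x2**2) - ((x1 - u1)**2 + x2**2))          # determines x1
h2 = sp.expand((x1**2 + x2**2) - ((x1 - u2)**2 + (x2 - u3)**2))   # determines x2

# Conclusion to prove: |OB|^2 = |OC|^2.
g = sp.expand(((x1 - u1)**2 + x2**2) - ((x1 - u2)**2 + (x2 - u3)**2))

# Wu's test: successively pseudo-divide the conclusion by the triangular
# hypotheses, eliminating dependent variables in reverse order.
r = sp.prem(g, h2, x2)
r = sp.prem(r, h1, x1)

# A final pseudo-remainder of zero proves the theorem under the usual
# non-degeneracy conditions (here, leading coefficients u1 and u3 nonzero).
print(sp.simplify(r))  # -> 0
```

Olympiad problems require far larger polynomial systems and careful handling of non-degeneracy conditions, but the reduction-to-zero test shown here is the mechanism the paper's symbolic baseline builds on.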
Concurrent Submissions: N/A
Submission Number: 47