Direct Induction Proof Challenge: Evaluating Large Language Models on Deeply Nested Mathematical Induction

Published: 09 Jul 2025, Last Modified: 26 Jul 2025
Venue: AI4Math@ICML25 Poster
License: CC BY-NC-SA 4.0
Keywords: mathematical induction, large language model, proof assistant, automated theorem proving, informal proof
Abstract: We introduce a challenge designed to evaluate the capability of Large Language Models (LLMs) in performing mathematical induction proofs, with a particular focus on nested induction. Our task requires models to construct direct induction proofs in both formal and informal settings, without relying on any preexisting lemmas. Experimental results indicate that current models struggle with generating direct induction proofs, suggesting that there remains significant room for improvement.
Submission Number: 171