Direct Induction Proof Challenge: Evaluating Large Language Models on Deeply Nested Mathematical Induction
Keywords: mathematical induction, large language model, proof assistant, automated theorem proving, informal proof
Abstract: We introduce a challenge designed to evaluate the ability of Large Language Models (LLMs) to perform mathematical induction proofs, with a particular focus on nested induction.
Our task requires models to construct direct induction proofs in both formal and informal settings, without relying on any preexisting lemmas.
Experimental results indicate that current models struggle to generate direct induction proofs, suggesting that significant room for improvement remains.
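As an illustration of the kind of proof the abstract describes, the following Lean 4 sketch proves commutativity of addition on a freshly defined natural-number type, so no library lemma is available. The type `N`, the function `plus`, and the theorem name are hypothetical and are not taken from the challenge itself. "Direct" is read here as "every auxiliary fact is established by an inner induction inside the main proof"; the benchmark's precise definition may differ.

```lean
-- Hypothetical self-contained setting: a fresh copy of the naturals,
-- so that no preexisting lemma about addition can be used.
inductive N where
  | zero : N
  | succ : N → N

-- Addition by structural recursion on the second argument.
def plus : N → N → N
  | m, N.zero   => m
  | m, N.succ n => N.succ (plus m n)

-- Commutativity, proved by an outer induction on n with inner inductions
-- supplying the facts that would normally be separate lemmas.
theorem plus_comm_direct (m n : N) : plus m n = plus n m := by
  induction n with
  | zero =>
    -- Inner induction: zero is a left identity.
    have zero_plus : ∀ a : N, plus N.zero a = a := by
      intro a
      induction a with
      | zero => rfl
      | succ k ihk => exact congrArg N.succ ihk
    exact (zero_plus m).symm
  | succ n ih =>
    -- Inner induction: succ can be pulled out of the first argument.
    have succ_plus : ∀ a b : N, plus (N.succ a) b = N.succ (plus a b) := by
      intro a b
      induction b with
      | zero => rfl
      | succ k ihk => exact congrArg N.succ ihk
    calc plus m (N.succ n) = N.succ (plus m n) := rfl
      _ = N.succ (plus n m) := congrArg N.succ ih
      _ = plus (N.succ n) m := (succ_plus n m).symm
```

The informal counterpart follows the same shape: induct on n, and within the base case and the inductive step carry out a second induction to justify the identity that a textbook proof would cite as a lemma.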
Submission Number: 171