Datasets, Formalizers, and Provers: A Survey of Formal Mathematical Reasoning with LLMs

ACL ARR 2026 January Submission 6242 Authors

05 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | License: CC BY 4.0
Keywords: Formal Mathematical Reasoning
Abstract: Recent advances in large language models (LLMs) have significantly expanded the capabilities of automated mathematical reasoning. This survey reviews recent progress in formal mathematical reasoning with LLMs from three interconnected perspectives: datasets, formalizers, and provers. We provide a comparative analysis of key benchmarks from 2022 to 2025, highlighting dataset properties and model performance trends. We argue that future progress will require richer and more diverse benchmarks, evaluation protocols that emphasize semantic correctness and robustness, and deeper alignment between informal mathematical intuition and formal symbolic structure.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: mathematical NLP, educational applications
Contribution Types: Surveys
Languages Studied: English, Lean
Submission Number: 6242