AI-Assisted Autoformalization of Combinatorics Problems in Proof Assistants

Long Doan, ThanhVu Nguyen

Published: 2025, Last Modified: 27 Jan 2026, Venue: NIER@ICSE 2025, License: CC BY-SA 4.0

Abstract: Proof assistants such as Coq and LEAN are increasingly used by renowned mathematicians to formalize and prove mathematical theorems. Despite their growing use, writing formal proofs is challenging, and even the first step of stating the problem formally is difficult because it requires a deep understanding of these systems’ languages. Recent advancements in AI, especially large language models (LLMs), have shown promise in automating this formalization task. However, domains such as combinatorics pose significant challenges for AI-assisted proof assistant systems due to their cryptic nature and the lack of existing data to train AI models. We introduce AutoForm4Lean, a system designed to leverage LLMs to aid in formalizing combinatorics problems for LEAN. By combining LLMs with techniques from software engineering and formal methods, such as validation and synthesis, AutoForm4Lean generates formalizations of combinatorics problems more effectively than current state-of-the-art LLMs. Moreover, this project seeks to provide a comprehensive collection of formalized combinatorics problems, theorems, and lemmas, which would enrich the LEAN library and provide valuable training data for LLMs. Preliminary results demonstrate the effectiveness of AutoForm4Lean in formalizing combinatorics problems in LEAN, marking a step forward in AI-based theorem proving.
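
To make "stating the problem formally" concrete, here is a minimal sketch of how one textbook combinatorics fact might be written in Lean 4. It is not taken from the paper and is not output of AutoForm4Lean. It assumes the Mathlib library is available, and the theorem name is hypothetical. The informal claim is "a set with n elements has exactly 2^n subsets", modeled here with the finite set {0, ..., n-1}.

```lean
import Mathlib

-- Informal statement: a set with n elements has exactly 2^n subsets.
-- Modeling choice (an assumption of this sketch): the n-element set is
-- `Finset.range n`, i.e. {0, 1, ..., n-1}.
theorem card_powerset_range_example (n : ℕ) :
    (Finset.powerset (Finset.range n)).card = 2 ^ n := by
  -- `Finset.card_powerset` rewrites the left side to 2 ^ (range n).card,
  -- and `Finset.card_range` rewrites (range n).card to n.
  rw [Finset.card_powerset, Finset.card_range]
```

Even in this small case the formal statement requires choices the informal sentence leaves open, such as which type represents the set and which library definitions express "subsets" and "number of", which is the kind of difficulty the abstract describes.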

External IDs: dblp:conf/icse/DoanN25