Keywords: Lean 4, Autoformalization, LLM, Formal System, Dataset
TL;DR: We propose KELPS, a framework that converts informal math to formal statements (Lean/Coq/Isabelle) via symbolic translation
Abstract: Modern large language models (LLMs) show promising progress in formalizing informal mathematics into machine-verifiable theorems. However, these methods still face bottlenecks due to the limited quantity and quality of multilingual parallel corpora. In this paper, we propose KELPS (Knowledge-Equation based Logical Processing System), a neuro-symbolic framework for synthesizing multiple high-quality formal languages (Lean, Coq, and Isabelle) from informal mathematical text. First, we translate natural language into Knowledge Equations (KEs), a novel language of our own design that is theoretically grounded in assertional logic. Next, we convert the KEs into the target languages through rigorously defined rules that preserve both syntactic structure and semantic meaning. This process yielded a parallel corpus of over 60,000 problems. Our KELPS translator, fine-tuned on this dataset, achieves 96.2% syntactic accuracy (pass@1) on MiniF2F with one-time automated grammar correction, outperforming SOTA models such as DeepSeek-V3 (87.8%) and Herald (90.3%) across multiple datasets.
Submission Number: 95