FormalAlign: Automated Alignment Evaluation for Autoformalization

Published: 22 Jan 2025, Last Modified: 27 Feb 2025ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Large Language models, Autoformalization, Lean 4, Formal Math, AI for Math
TL;DR: FormalAlign is a framework that automatically evaluates the alignment between informal and formal mathematical proofs, significantly improving performance and reducing reliance on manual verification.
Abstract: Autoformalization aims to convert informal mathematical proofs into machine-verifiable formats, bridging the gap between natural and formal languages. However, ensuring semantic alignment between the informal and formalized statements remains challenging. Existing approaches heavily rely on manual verification, hindering scalability. To address this, we introduce FormalAlign, a framework for automatically evaluating the alignment between natural and formal languages in autoformalization. FormalAlign trains on both the autoformalization sequence generation task and the representational alignment between input and output, employing a dual loss that combines a pair of mutually enhancing autoformalization and alignment tasks. Evaluated across four benchmarks augmented by our proposed misalignment strategies, FormalAlign demonstrates superior performance. In our experiments, FormalAlign outperforms GPT-4, achieving an Alignment-Selection Score 11.58\% higher on \forml-Basic (99.21\% vs. 88.91\%) and 3.19\% higher on MiniF2F-Valid (66.39\% vs. 64.34\%). This effective alignment evaluation significantly reduces the need for manual verification.
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1997
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview