NLIR: Natural Language Intermediate Representation for Mechanized Theorem Proving

Laetitia Teodorescu, Guillaume Baudart, Emilio Jesús Gallego Arias, Marc Lelarge

Published: 25 Sept 2024, Last Modified: 30 Sept 2024, OpenReview Archive Direct Upload, Everyone, CC BY 4.0

Abstract: Formal theorem proving is challenging for humans as well as for machines. Thanks to recent advances in LLM capabilities, we believe natural language can serve as a universal interface for reasoning about formal proofs. In this paper, 1) we introduce Pétanque, a new lightweight environment to interact with the Coq theorem prover; 2) we present two interactive proof protocols leveraging natural language as an intermediate representation for designing proof steps; 3) we implement beam search over these interaction protocols, using natural language to rerank proof candidates; and 4) we use Pétanque to benchmark our search algorithms. Using our method with GPT-4o, we successfully synthesize proofs for 46% of the Logical Foundations series and for 50% of the first 100 of the 260 lemmas from the newly published Busy Beaver proofs.
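To make the beam search with reranking mentioned in the abstract concrete, here is a minimal, generic sketch. It does not use the Pétanque API or an LLM. The names `beam_search`, `toy_expand`, `toy_score`, and `toy_is_done` are hypothetical, the "proof state" is just a count of open goals, and the numeric `score` callback is only a stand-in for the natural-language reranking described in the paper.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

S = TypeVar("S")  # abstract proof state (assumption: any value works)


def beam_search(
    initial: S,
    expand: Callable[[S], Iterable[Tuple[str, S]]],
    score: Callable[[S], float],
    is_done: Callable[[S], bool],
    beam_width: int = 2,
    max_depth: int = 10,
) -> Optional[List[str]]:
    """Return a list of steps reaching a finished state, or None."""
    if is_done(initial):
        return []
    beam: List[Tuple[S, List[str]]] = [(initial, [])]
    for _ in range(max_depth):
        candidates: List[Tuple[S, List[str]]] = []
        for state, steps in beam:
            for step, next_state in expand(state):
                path = steps + [step]
                if is_done(next_state):
                    return path
                candidates.append((next_state, path))
        if not candidates:
            return None
        # Rerank candidates and keep only the best `beam_width` of them.
        candidates.sort(key=lambda c: score(c[0]), reverse=True)
        beam = candidates[:beam_width]
    return None


# Toy environment (hypothetical): a state is the number of open goals.
def toy_expand(goals: int) -> List[Tuple[str, int]]:
    # "solve" closes one goal, "split" opens an extra one.
    return [("solve", goals - 1), ("split", goals + 1)]


def toy_score(goals: int) -> float:
    # Fewer open goals is better; stands in for an LLM-based ranking.
    return -float(goals)


def toy_is_done(goals: int) -> bool:
    return goals == 0


proof = beam_search(3, toy_expand, toy_score, toy_is_done)
print(proof)
```

In a real setting, `expand` would ask a model for candidate proof steps and check them with the prover, and `score` would be replaced by whatever ranking the chosen protocol provides.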