SatLM: Satisfiability-Aided Language Models Using Declarative Prompting

Published: 28 Oct 2023, Last Modified: 17 Nov 2023MATH-AI 23 PosterEveryoneRevisionsBibTeX
Keywords: logical reasoning, large language model, satisfiability
Abstract: Prior work has combined chain-of-thought prompting in large language models (LLMs) with programmatic representations to perform reasoning. While such an approach works well for tasks that only require forward reasoning (e.g., straightforward arithmetic), it is less effective for problems that require more sophisticated planning and search. In this paper, we propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of LLMs. We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. By offloading the actual reasoning task to an automated theorem prover, our approach can guarantee the correctness of the answer with respect to the parsed specification and avoid planning errors in the solving process. We evaluate SatLM on 6 datasets and show that it consistently outperforms program-aided LMs in an imperative paradigm. In particular, SatLM outperforms program-aided LMs by more than 20% on a challenging subset of the GSM arithmetic reasoning dataset; SatLM also achieves a new SoTA on LSAT and BoardgameQA.
Submission Number: 32