Ranking LLM-Generated Loop Invariants for Program Verification

Published: 07 Oct 2023, Last Modified: 01 Dec 2023 · EMNLP 2023 Findings
Submission Type: Regular Short Paper
Submission Track: Theme Track: Large Language Models and the Future of NLP
Submission Track 2: NLP Applications
Keywords: Large Language Model, Loop Invariant Synthesis, Re-ranking
TL;DR: LLMs do generate verified loop invariants, but often only after several unsuccessful trials; re-ranking the LLMs' generations can save the cost of those unsuccessful trials.
Abstract: Synthesizing inductive loop invariants is fundamental to automating program verification. In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a zero-shot setting, yet require several samples to generate the correct invariants. This can lead to a large number of calls to a program verifier to establish an invariant. To address this issue, we propose a re-ranking approach for the generated results of LLMs. We have designed a ranker that can distinguish between correct inductive invariants and incorrect attempts based on the problem definition. The ranker is optimized as a contrastive ranker. Experimental results demonstrate that this re-ranking mechanism significantly improves the ranking of correct invariants among the generated candidates, leading to a notable reduction in the number of calls to a verifier.
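
To make the abstract's pipeline concrete, below is a minimal sketch, not the paper's actual implementation, of the two pieces it describes: a contrastive objective that trains the ranker to score a correct inductive invariant above incorrect attempts for the same problem, and a re-ranking loop that spends verifier calls in descending score order. All function names, embedding shapes, and the temperature value are illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple

import torch
import torch.nn.functional as F


def contrastive_loss(
    problem_emb: torch.Tensor,   # (d,) embedding of the problem definition
    pos_emb: torch.Tensor,       # (d,) embedding of a correct inductive invariant
    neg_embs: torch.Tensor,      # (k, d) embeddings of incorrect attempts
    temperature: float = 0.07,   # illustrative value, not from the paper
) -> torch.Tensor:
    """InfoNCE-style objective: score the correct invariant above incorrect ones."""
    pos = (problem_emb @ pos_emb) / temperature                 # scalar similarity
    negs = (neg_embs @ problem_emb) / temperature               # (k,) similarities
    logits = torch.cat([pos.unsqueeze(0), negs]).unsqueeze(0)   # (1, k + 1)
    target = torch.zeros(1, dtype=torch.long)                   # correct one at index 0
    return F.cross_entropy(logits, target)


def rerank_and_verify(
    problem: str,
    candidates: List[str],
    score: Callable[[str, str], float],   # trained ranker: (problem, invariant) -> score
    verify: Callable[[str, str], bool],   # program verifier oracle
) -> Tuple[Optional[str], int]:
    """Call the verifier in ranked order; return (invariant, #verifier calls used)."""
    ranked = sorted(candidates, key=lambda inv: score(problem, inv), reverse=True)
    for n_calls, inv in enumerate(ranked, start=1):
        if verify(problem, inv):
            return inv, n_calls
    return None, len(ranked)
```

Under this framing, the savings the abstract claims correspond to the gap between `n_calls` when candidates are verified in an arbitrary sampling order versus in ranked order.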
Submission Number: 5266