DafnyLLM: Pre-training Dafny Representations with Large Language Models for Code Verification
Track: long paper (up to 8 pages)
Keywords: Code LLM, Formal Verification
TL;DR: We propose DafnyLLM, a specialized language model designed to automate formal verification by bridging the gap between code implementation and logical specifications.
Abstract: Formal verification requires a deep integration of program semantics and logical specifications, posing challenges beyond those addressed by general-purpose code language models. In particular, the Dafny language introduces a tightly coupled syntax of implementation and specification, demanding models that can capture both procedural structure and verification intent. In this work, we propose DafnyLLM, the first pre-trained large language model tailored for the Dafny verification language. DafnyLLM leverages hybrid structural priors, including abstract syntax trees, control/data flow graphs, and assertion flow graphs, to encode rich program and specification semantics. To support Dafny’s long-range dependencies across intertwined definitions, specifications, and proofs, we design a sparse attention mechanism with memory and bridge tokens, enabling efficient and scalable modeling of verification contexts. We pre-train DafnyLLM on a curated corpus of real-world Dafny programs and evaluate it on verification-aware downstream tasks, such as specification-conditioned code retrieval and assertion classification. Experimental results show that DafnyLLM outperforms state-of-the-art code models by a significant margin, demonstrating the importance of incorporating verification-specific structure in code representation learning.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Shentong_Mo1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 55