CryptoFormalEval: Integrating Large Language Models and Formal Verification for Automated Cryptographic Protocol Vulnerability Detection

Published: 10 Oct 2024 · Last Modified: 28 Oct 2024 · Sys2-Reasoning Poster · CC BY 4.0
Keywords: Security Protocols, Vulnerabilities, Tamarin, Large Language Models, Evaluation, LLM-based agent
TL;DR: CryptoFormalEval integrates large language models with formal verification to automate the detection of vulnerabilities in cryptographic protocols.
Abstract: Cryptographic protocols play a fundamental role in securing modern digital infrastructure, but they are often deployed without prior formal verification, potentially leading to the widespread adoption of distributed systems vulnerable to unforeseen attack vectors. Formal verification methods, on the other hand, often require complex and time-consuming techniques that lack automation. In this paper, we introduce a benchmark to assess the ability of Large Language Models (LLMs) to autonomously identify vulnerabilities in new cryptographic protocols through interaction with Tamarin, a powerful theorem prover for protocol verification. We propose a manually validated dataset of novel, flawed communication protocols, and we design a method to automatically verify the vulnerabilities found by the AI agents. We report early results on the performance of current frontier models on the benchmark, leveraging state-of-the-art prompting and scaffolding techniques to boost their capabilities. With this paper, we aim to provide insights into the innovative cybersecurity applications that can be obtained by integrating LLMs with symbolic reasoning systems.
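As a rough illustration of the kind of agent–prover interaction the benchmark evaluates, the following Python sketch shows an LLM agent that emits a Tamarin theory, runs `tamarin-prover --prove` on it, and treats falsified lemmas (properties for which the prover finds an attack trace) as detected vulnerabilities. This is not the authors' implementation; the `ask_llm` helper, file names, prompt wording, and loop structure are hypothetical placeholders.

```python
"""Minimal sketch (not the paper's code) of an LLM-agent / Tamarin loop."""
import subprocess


def run_tamarin(spthy_path: str, timeout: int = 600) -> str:
    # `tamarin-prover --prove` attempts to prove every lemma in the theory;
    # lemmas reported as "falsified" come with a concrete attack trace.
    result = subprocess.run(
        ["tamarin-prover", "--prove", spthy_path],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr


def find_falsified_lemmas(tamarin_output: str) -> list[str]:
    # Heuristic parse of Tamarin's summary: lines mentioning "falsified"
    # correspond to security properties that do not hold.
    return [line.strip() for line in tamarin_output.splitlines()
            if "falsified" in line]


def agent_loop(protocol_description: str, ask_llm, max_rounds: int = 5):
    # ask_llm(prompt) -> str is a hypothetical wrapper around an LLM API.
    feedback = ""
    for _ in range(max_rounds):
        spthy = ask_llm(
            "Model the following protocol in Tamarin's .spthy syntax and "
            "state the expected secrecy/authentication lemmas.\n"
            f"Protocol:\n{protocol_description}\n"
            f"Previous Tamarin feedback:\n{feedback}"
        )
        with open("candidate.spthy", "w") as f:
            f.write(spthy)
        output = run_tamarin("candidate.spthy")
        falsified = find_falsified_lemmas(output)
        if falsified:
            return falsified  # vulnerabilities confirmed by the prover
        feedback = output     # let the agent refine its model
    return []
```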
Submission Number: 59