Topology of Attention: Detecting Hallucinations in Code Generation Models

ACL ARR 2025 February Submission 7851 Authors

16 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: As AI code-assistant tools become widespread, automatic assessment of the correctness of generated code becomes a significant challenge. Code-generating LLMs are prone to hallucinations, which may lead to code that does not solve the required problem or even to code with severe security vulnerabilities. In this paper, we propose a new approach to assessing code correctness. Our solution is based on topological data analysis (TDA) of the attention maps of code LLMs. We carry out experiments on two benchmarks, HumanEval and MBPP, and five code LLMs: StarCoder2-7B, CodeLlama-7B, DeepSeek-Coder-6.7B, Qwen2.5-Coder-7B, and Magicoder-S-DS-6.7B. Experimental results show that the proposed method outperforms several baselines. Moreover, the trained classifiers transfer between coding benchmarks.
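The sketch below is not the authors' pipeline, only a minimal illustration of the idea named in the abstract: extract attention maps from a code LLM, treat each head's map as a distance matrix, summarize it with persistent homology, and train a correctness classifier on those features. The model choice, feature set, and sequence length are illustrative assumptions.

```python
# Hedged sketch of TDA-on-attention-maps correctness scoring; all specifics are assumptions.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from ripser import ripser
from sklearn.linear_model import LogisticRegression

MODEL = "bigcode/starcoder2-7b"  # one of the five code LLMs named in the abstract
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_attentions=True)
model.eval()

def topological_features(code: str) -> np.ndarray:
    """Persistence-based summary features computed from every attention head."""
    inputs = tok(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = model(**inputs)
    feats = []
    for layer_attn in out.attentions:       # one tensor per layer: (1, heads, T, T)
        for head in layer_attn[0]:          # (T, T) attention map
            a = head.float().numpy()
            sym = np.maximum(a, a.T)        # symmetrize so 1 - a behaves like a distance
            dist = 1.0 - sym
            np.fill_diagonal(dist, 0.0)
            dgms = ripser(dist, maxdim=1, distance_matrix=True)["dgms"]
            h0 = dgms[0][np.isfinite(dgms[0][:, 1])]
            h1 = dgms[1]
            feats += [
                (h0[:, 1] - h0[:, 0]).sum() if len(h0) else 0.0,  # total H0 persistence
                (h1[:, 1] - h1[:, 0]).sum() if len(h1) else 0.0,  # total H1 persistence
                float(len(h1)),                                    # number of 1-cycles
            ]
    return np.array(feats)

# Supervised training on a benchmark such as HumanEval, with labels from unit tests:
# X = np.stack([topological_features(c) for c in generated_programs])
# clf = LogisticRegression(max_iter=1000).fit(X, passes_tests)
```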
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: NLP Applications, Generation, Generalization of NLP Models
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: Python
Submission Number: 7851