Keywords: Repository-Level Code Completion, Code LLMs, Query Refinement, Rerank, Dynamic Attention Mechanism
Abstract: Repository-level code completion, which leverages the entire codebase to generate suggestions, is crucial for enhancing developer productivity. While retrieval-augmented generation (RAG) with code large language models (code LLMs) has become the standard approach for this task, existing methods still suffer from several problems: they struggle with uninformative queries built from incomplete code, overlook the impact of prompt structure and snippet ordering, and retrieve textually similar but functionally different code. These issues collectively introduce noise and lead to ineffective context for code LLMs. To address these problems, we introduce RepoAttention, a training-free RAG-based framework for repository-level code completion. It improves retrieval and completion performance through dual-path query refinement, relevance-aware reranking of retrieved snippets, and a dynamic relevance-guided attention mechanism. Experiments on CCEval and RepoEval show that RepoAttention surpasses state-of-the-art methods by 23.9% in Exact Match accuracy and generalizes well across multiple code LLMs and programming languages.
Paper Type: Long
Research Area: Code Models
Research Area Keywords: code completion, code retrieval, retrieval-augmented generation, code generation and understanding
Contribution Types: NLP engineering experiment
Languages Studied: programming languages, English
Submission Number: 7275