BRAD: Enhancing Code Summarization with a Bytecode Retrieval-Augmented Deliberation Model

ACL ARR 2026 January Submission8565 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Code Summarization, Deliberation Network, Retrieval-Augmented, Bytecode
Abstract: Comments and summaries play a critical role in program comprehension, yet high-quality comments are often missing from large codebases. Existing code summarization methods improve linguistic fluency but still fall short in capturing program execution semantics. Furthermore, source-level retrieval methods struggle to identify code examples that are syntactically different but functionally equivalent. To address these limitations, we propose BRAD (Bytecode Retrieval-Augmented Deliberation), a novel retrieval-augmented deliberation model for code summarization that integrates bytecode-level semantics with multi-pass deliberation. Specifically, BRAD (1) encodes bytecode control-flow graphs (CFGs) with a Graph Attention Network (GAT) to preserve control-flow semantics, and (2) retrieves exemplar drafts in the bytecode text space to find functionally similar references. These are then fused with textual encodings in a deliberative refinement process to generate the final summary. We evaluate BRAD on a public Java dataset; the results show that, compared with strong baselines, BRAD improves BLEU-1 by 18.7%, BLEU-2 by 16.2%, BLEU-3 by 8.5%, BLEU-4 by 2.8%, ROUGE-L by 13.4%, and CIDEr by 10.0%.
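To make the CFG-encoding step concrete: the abstract describes attending over a bytecode control-flow graph with a GAT so that each basic block's embedding aggregates information from its control-flow neighbors. The sketch below is a minimal single-head graph-attention layer over a toy three-block CFG, written in plain NumPy. It is purely illustrative of the standard GAT mechanism; the function name `gat_layer`, the toy adjacency, and all dimensions are our assumptions, not the authors' implementation.

```python
import numpy as np

def gat_layer(H, A, W, a, leaky_slope=0.2):
    """Single-head graph-attention layer (illustrative, not BRAD's code).

    H: (N, F) node features, A: (N, N) adjacency with self-loops,
    W: (F, F') projection, a: (2*F',) attention vector.
    """
    Z = H @ W                                   # project node features
    Fp = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]), computed for all pairs at once
    e = (Z @ a[:Fp])[:, None] + (Z @ a[Fp:])[None, :]
    e = np.where(e > 0, e, leaky_slope * e)
    e = np.where(A > 0, e, -1e9)                # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax per node
    return np.maximum(alpha @ Z, 0)             # ReLU of attended features

# Toy CFG with 3 basic blocks: entry -> branch -> exit -> entry (loop)
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)          # edges plus self-loops
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                     # per-block feature vectors
W = rng.normal(size=(4, 4))
a = rng.normal(size=8)
out = gat_layer(H, A, W, a)
print(out.shape)                                # one embedding per block
```

In a full model these per-block embeddings would be pooled into a graph-level vector and fused with the textual encoder's output during the deliberation passes, but those fusion details are not specified in the abstract.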
Paper Type: Long
Research Area: Code Models
Research Area Keywords: code summarization
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 8565