BRAD: Enhancing Code Summarization with a Bytecode Retrieval-Augmented Deliberation Model

ACL ARR 2026 January Submission8565 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Code Summarization, Deliberation Network, Retrieval-Augmented, Bytecode
Abstract: Comments and summaries play a critical role in program comprehension, yet high-quality comments are often missing from large codebases. Existing code summarization methods improve linguistic fluency but still fall short in capturing program execution semantics. Furthermore, source-level retrieval methods struggle to identify code examples that are syntactically different but functionally equivalent. To address these limitations, we propose BRAD (Bytecode Retrieval-Augmented Deliberation), a novel retrieval-augmented deliberation model for code summarization that integrates bytecode-level semantics with multi-pass deliberation. Specifically, BRAD (1) encodes bytecode control-flow graphs (CFGs) with a Graph Attention Network (GAT) to preserve control-flow semantics, and (2) retrieves exemplar drafts in the bytecode text space to find functionally similar references. These are then fused with textual encodings in a deliberative refinement process to generate the final summary. We evaluate BRAD on a public Java dataset; the results show that, compared with strong baselines, BRAD improves BLEU-1 by 18.7%, BLEU-2 by 16.2%, BLEU-3 by 8.5%, BLEU-4 by 2.8%, ROUGE-L by 13.4%, and CIDEr by 10.0%.
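To make the CFG-encoding step concrete: the abstract describes attending over a bytecode control-flow graph with a GAT so that each basic block's embedding aggregates information from its control-flow neighbors. The sketch below is a minimal single-head graph-attention layer over a toy three-block CFG, written in plain NumPy. It is purely illustrative of the standard GAT mechanism; the function name `gat_layer`, the toy adjacency, and all dimensions are our assumptions, not the authors' implementation.

```python
import numpy as np

def gat_layer(H, A, W, a, leaky_slope=0.2):
    """Single-head graph-attention layer (illustrative, not BRAD's code).

    H: (N, F) node features, A: (N, N) adjacency with self-loops,
    W: (F, F') projection, a: (2*F',) attention vector.
    """
    Z = H @ W                                   # project node features
    Fp = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]), computed for all pairs at once
    e = (Z @ a[:Fp])[:, None] + (Z @ a[Fp:])[None, :]
    e = np.where(e > 0, e, leaky_slope * e)
    e = np.where(A > 0, e, -1e9)                # mask non-edges
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax per node
    return np.maximum(alpha @ Z, 0)             # ReLU of attended features

# Toy CFG with 3 basic blocks: entry -> branch -> exit -> entry (loop)
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)          # edges plus self-loops
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                     # per-block feature vectors
W = rng.normal(size=(4, 4))
a = rng.normal(size=8)
out = gat_layer(H, A, W, a)
print(out.shape)                                # one embedding per block
```

In a full model these per-block embeddings would be pooled into a graph-level vector and fused with the textual encoder's output during the deliberation passes, but those fusion details are not specified in the abstract.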
Paper Type: Long
Research Area: Code Models
Research Area Keywords: code summarization
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 8565