A Survey of Retentive Network

ACL ARR 2026 January Submission 706 Authors

24 Dec 2025 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: Language Modeling, Retentive Network
Abstract: The Retentive Network (RetNet) has recently emerged as a formidable successor to the Transformer architecture. Although the self-attention mechanism excels at capturing global dependencies, its inherent quadratic complexity imposes significant memory constraints and inhibits scalability during long-sequence modeling. To overcome these challenges, RetNet introduces an innovative retention mechanism that integrates the inductive bias of recurrent neural networks with the parallelizable training advantages of attention-based models. This unified representation allows RetNet to achieve constant-time inference and linear-time training without sacrificing representational capacity. Despite the growing body of research demonstrating the efficacy of RetNet across diverse fields such as natural language processing, computer vision, and time-series analysis, a systematic synthesis of the literature is still unavailable. This paper presents the first comprehensive survey of the Retentive Network, offering a detailed examination of its architectural foundations, core innovations, and specialized variants. Furthermore, we provide a multidisciplinary analysis of its applications, ranging from basic sequence tasks to complex cross-modal scenarios. Finally, we offer prospective insights and suggest strategic avenues for future inquiry to facilitate the continued evolution of RetNet in both academic research and large-scale industrial applications.
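To make the abstract's central claim concrete, the retention mechanism admits two mathematically equivalent forms: a parallel form for training, Retention(X) = (QK^T ⊙ D)V with decay matrix D[n, m] = γ^(n−m) for n ≥ m and 0 otherwise, and a recurrent form for inference, S_n = γ S_{n−1} + k_n^T v_n with output o_n = q_n S_n. The single-head PyTorch sketch below checks this equivalence numerically; the tensor shapes and the value of γ are illustrative assumptions, and RetNet's xPos-style rotations and multi-scale (per-head γ) design are omitted for brevity.

```python
# Minimal sketch of single-head retention in its two equivalent forms.
# Shapes and gamma are illustrative; xPos rotations and multi-scale heads
# from the full RetNet architecture are deliberately left out.
import torch

def parallel_retention(q, k, v, gamma):
    """Parallel (training) form: Retention(X) = (Q K^T * D) V,
    where D[n, m] = gamma**(n - m) for n >= m, else 0."""
    n = q.shape[0]
    idx = torch.arange(n)
    exp = (idx[:, None] - idx[None, :]).clamp(min=0).float()
    mask = (idx[:, None] >= idx[None, :]).float()
    d = (gamma ** exp) * mask            # causal decay matrix D
    return (q @ k.T * d) @ v

def recurrent_retention(q, k, v, gamma):
    """Recurrent (inference) form: S_n = gamma * S_{n-1} + k_n^T v_n,
    o_n = q_n S_n -- a fixed-size state, hence constant-memory decoding."""
    d_k, d_v = k.shape[1], v.shape[1]
    s = torch.zeros(d_k, d_v)
    outs = []
    for t in range(q.shape[0]):
        s = gamma * s + k[t:t+1].T @ v[t:t+1]  # state update
        outs.append(q[t:t+1] @ s)              # readout for step t
    return torch.cat(outs, dim=0)

# Both forms produce the same outputs on random inputs.
torch.manual_seed(0)
q, k, v = (torch.randn(8, 4) for _ in range(3))
assert torch.allclose(parallel_retention(q, k, v, 0.9),
                      recurrent_retention(q, k, v, 0.9), atol=1e-5)
```

The final assertion is the crux: because the two forms agree, RetNet can be trained in parallel over the whole sequence like a Transformer, yet decode step by step with an O(1) state, which is what the abstract means by linear-time training and constant-time inference.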
Paper Type: Long
Research Area: LLM Efficiency
Research Area Keywords: Language Modeling, Retentive Network
Contribution Types: Surveys
Languages Studied: English
Submission Number: 706