A Survey of Retentive Network

ACL ARR 2025 May Submission 1152 Authors

16 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Retentive Network (RetNet) represents a groundbreaking advancement in neural network architecture, offering an efficient alternative to the Transformer. Through the introduction of a novel retention mechanism that combines the advantages of recurrence and attention, RetNet strikes a balance among strong performance, parallel training capabilities, and low inference cost. This approach effectively addresses the limitations of traditional Transformers, particularly in handling long sequences. As interest in RetNet continues to grow, it has shown impressive performance across various fields such as natural language processing, speech recognition, and time-series analysis. However, a comprehensive review of RetNet is still missing from the current literature. This paper aims to fill that gap by offering the first detailed survey of the RetNet architecture, its key innovations, and its diverse applications. We also explore the main challenges associated with RetNet and propose future research directions to support its continued advancement in both academic research and practical deployment.
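The abstract's claim that retention reconciles parallel training with low-cost inference rests on the mechanism's two equivalent forms: a parallel form computed like decayed attention, and a recurrent form that carries a fixed-size state. The single-head sketch below (an illustration under assumed shapes and a scalar decay `gamma`, not the authors' implementation) checks that both forms produce the same outputs:

```python
import numpy as np

# Minimal single-head retention sketch (illustrative only).
# T = sequence length, d = head dimension, gamma = fixed decay factor.
T, d, gamma = 5, 4, 0.9
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))

# Parallel form (used in training): O = (Q K^T * D) V,
# where D[n, m] = gamma^(n - m) for n >= m and 0 otherwise (causal decay mask).
idx = np.arange(T)
D = np.where(idx[:, None] >= idx[None, :],
             gamma ** (idx[:, None] - idx[None, :]), 0.0)
O_parallel = (Q @ K.T * D) @ V

# Recurrent form (used in inference): a d-by-d state is updated per step,
# S_n = gamma * S_{n-1} + K_n^T V_n, and the output is O_n = Q_n S_n.
S = np.zeros((d, d))
O_recurrent = np.zeros((T, d))
for n in range(T):
    S = gamma * S + np.outer(K[n], V[n])
    O_recurrent[n] = Q[n] @ S

# Both forms expand to O_n = sum_{m<=n} gamma^(n-m) (Q_n . K_m) V_m,
# so they agree to numerical precision.
assert np.allclose(O_parallel, O_recurrent)
```

Because the recurrent form keeps only the `d × d` state `S`, per-token inference cost is constant in sequence length, which is the efficiency advantage over Transformer attention that the survey highlights.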
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: Language Modeling, Retentive Network
Contribution Types: Surveys
Languages Studied: English
Keywords: Language Modeling, Retentive Network
Submission Number: 1152