Keywords: Recurrent Neural Network, Gate mechanism
Abstract: Linear Recurrent Neural Networks (RNNs) have attracted attention for their memory and computational efficiency.
In particular, gated linear RNNs achieve nonlinear transformations through gating mechanisms while maintaining linear time complexity by removing the dependence of the gates on hidden states.
However, the impact of these gating mechanisms, and of removing hidden states from them, remains unexplored.
Here we empirically investigate the impact of these gating mechanisms and find that gate values near zero or one depend strongly on hidden states, so that removing hidden states in gated linear RNNs induces unintended shifts in the distribution of gate values.
Based on our findings, we propose an algorithm to mitigate these distribution shifts, which empirically improves performance on long-sequence modeling tasks.
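A minimal sketch (not the paper's code) of the distinction the abstract draws: a standard gated recurrence whose gate is conditioned on the previous hidden state, versus a gated linear recurrence whose gate is computed from the input alone. All parameter names (W_g, U_g, W_h) are illustrative placeholders.

```python
import torch

def gated_step(x_t, h_prev, W_g, U_g, W_h):
    # Gate conditioned on both the current input and the previous hidden state
    # (nonlinear recurrence; cannot be parallelized over time).
    g_t = torch.sigmoid(x_t @ W_g + h_prev @ U_g)
    h_tilde = torch.tanh(x_t @ W_h)
    return g_t * h_prev + (1.0 - g_t) * h_tilde

def gated_linear_step(x_t, h_prev, W_g, W_h):
    # Gate conditioned on the input only; the recurrence stays linear in h_prev,
    # which is what permits linear-time / parallel computation over the sequence.
    g_t = torch.sigmoid(x_t @ W_g)
    h_tilde = torch.tanh(x_t @ W_h)
    return g_t * h_prev + (1.0 - g_t) * h_tilde
```

Under this reading, the paper's observation is that g_t in the first form concentrates near zero or one partly because of h_prev, so dropping that dependence (second form) shifts the gate-value distribution.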
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 16349