Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs

Published: 10 Jun 2025, Last Modified: 15 Jul 2025, MOSS@ICML2025, CC BY 4.0
Keywords: SSMs, Linear Recurrent Neural Networks, State-tracking, expressivity
Abstract: Recent work has shown that linear recurrent neural network (LRNN) models such as S4D, Mamba, and DeltaNet lack state-tracking capability because their transition matrices are either time-invariant or restricted to non-negative eigenvalues. To address this, input-dependent transition matrices, in particular complex or non-triangular ones, have been proposed to improve SSM performance on such tasks. Existing theorems show that SSMs with input-independent transitions and SSMs with only non-negative eigenvalues each fail to solve even simple state-tracking tasks such as parity, regardless of depth, but they do not address whether combining these two layer types in a multilayer SSM could help. We investigate this question for efficient SSMs with diagonal transition matrices and show that such combinations still fail to solve parity. This implies that, to solve parity, at least one recurrence layer must be both input-dependent and admit negative eigenvalues. Our experiments support this conclusion by analyzing an SSM model that combines S4D and Mamba layers.
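To make the parity argument concrete, below is a minimal sketch (an illustrative toy, not the submission's notebook) of a scalar diagonal linear recurrence. The function names and the choice of transition values are assumptions for illustration: a transition that is both input-dependent and equal to -1 on ones computes parity exactly, whereas clamping the transition to non-negative values (as in a Mamba-style parameterization) yields a monotone product from which parity cannot be read off.

```python
# Toy illustration (assumed example, not the paper's code): why parity needs an
# input-dependent transition with a negative eigenvalue in a diagonal recurrence.
import numpy as np

def parity_via_ssm(bits):
    """Scalar diagonal recurrence h_t = a(x_t) * h_{t-1}, h_0 = 1.

    With a(x_t) = -1 when x_t = 1 and a(x_t) = +1 when x_t = 0, the state flips
    sign on every 1, so h_T = (-1)^{#ones}. The transition is input-dependent
    and uses the negative eigenvalue -1.
    """
    h = 1.0
    for x in bits:
        a = -1.0 if x == 1 else 1.0  # input-dependent, negative eigenvalue
        h = a * h
    return 0 if h > 0 else 1  # parity of the number of ones

def nonneg_ssm_state(bits):
    """Same recurrence but with eigenvalues restricted to [0, 1].

    With a(x_t) >= 0 the state is a product of non-negative factors, so it
    shrinks monotonically and cannot oscillate with the count of ones; parity
    is not recoverable from it.
    """
    h = 1.0
    for x in bits:
        a = 0.5 if x == 1 else 1.0  # input-dependent but non-negative
        h = a * h
    return h

if __name__ == "__main__":
    for bits in ([1, 0, 1, 1], [1, 1], [0, 0, 1]):
        print(bits, "parity:", parity_via_ssm(bits),
              "non-negative state:", round(nonneg_ssm_state(bits), 3))
```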
Code: ipynb
Submission Number: 92