Mitigating Over-Smoothing in Mamba2 via Spectral Domain Analysis

Published: 11 Jun 2025, Last Modified: 10 Jul 2025, Venue: ES-FoMo III, License: CC BY 4.0
Keywords: state space model, over smoothing, mamba, spectral domain analysis
TL;DR: We study Mamba2 through spectral analysis and show that it inherently behaves as a low-pass filter, which leads to over-smoothing.
Abstract: Mamba2, a rising contender against transformer-based architectures, has garnered significant attention for its impressive performance across diverse tasks, sparking a wave of research into its analysis and improvement. In this paper, we investigate Mamba2 through the lens of spectral analysis, uncovering a critical structural bias: Mamba2 inherently functions as a low-pass filter, leading to over-smoothing. Over-smoothing, where token representations become overly uniform, hampers the model’s ability to capture rich and diverse features, ultimately contributing to performance degradation. To address this, we propose a straightforward yet effective high-frequency enhancement method. By selectively amplifying high-frequency components at the layer level, our approach mitigates the over-smoothing effect, restoring token diversity and improving representational richness. Experiments confirm the efficacy of our method, demonstrating its ability to enhance Mamba2’s performance across key tasks.
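To make the idea of layer-level high-frequency enhancement concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not say how the frequency components are defined, so the sketch assumes the low-frequency part is the DC component (the per-feature mean over tokens, i.e. the zero-frequency DFT term along the token axis) and treats everything else as high-frequency. The function names and the fixed scalar `alpha` are hypothetical.

```python
import numpy as np

def split_frequencies(x):
    """Split token features x of shape (tokens, dim) into DC and high-frequency parts.

    Assumption: "low frequency" means only the zero-frequency DFT component
    along the token axis, which equals the mean over tokens.
    """
    dc = x.mean(axis=0, keepdims=True)  # shape (1, dim), identical for every token
    hc = x - dc                         # residual: all non-zero frequencies
    return dc, hc

def enhance_high_freq(x, alpha=0.5):
    """Amplify the high-frequency part by (1 + alpha); alpha is a hypothetical knob."""
    dc, hc = split_frequencies(x)
    return dc + (1.0 + alpha) * hc

def high_freq_ratio(x):
    """Share of the feature norm carried by high frequencies.

    Values near 0 mean the tokens are almost identical (over-smoothed).
    """
    _, hc = split_frequencies(x)
    return np.linalg.norm(hc) / np.linalg.norm(x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "layer output": 16 tokens, 4 features, with a shared offset (DC part).
    x = rng.normal(size=(16, 4)) + 1.0
    y = enhance_high_freq(x, alpha=0.5)
    print("high-frequency ratio before:", high_freq_ratio(x))
    print("high-frequency ratio after: ", high_freq_ratio(y))
```

Under this assumption the token mean is left unchanged and only the deviations between tokens are rescaled, so the enhancement increases token diversity without shifting the shared component. A learnable, per-layer or per-channel scale would be a natural variant, but whether the paper uses one is not stated in the abstract.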
Submission Number: 46