Log-Normal State-Space Model

03 Sept 2025 (modified: 08 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Linear attention, State-space model, Log-normal distribution, Long-range dependency, Sequence modeling
TL;DR: An efficient state space model achieving high performance without extra MLPs.
Abstract: State space models (SSMs) have emerged as a strong alternative to transformers owing to their linear-time complexity and state-retention mechanism, which improve computational efficiency and memory capability, especially on long-sequence tasks. However, the features derived from state updates in SSMs still exhibit weaker representational power than those generated by self-attention in transformers. In this work, we propose a new architecture that preserves the linear-time efficiency of SSMs while enabling state-update features to approach the expressiveness of self-attention, thereby achieving both computational efficiency and enhanced memory. Our code is available at https://anonymous.4open.science/r/Log-Normal-State-Space-Model-8301/.gitignore.
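To make the contrast with self-attention concrete, below is a minimal sketch of the generic diagonal linear SSM recurrence that the abstract refers to (h_t = A ⊙ h_{t-1} + B x_t, y_t = C h_t). This is an illustrative baseline only, not the paper's Log-Normal SSM; all names, shapes, and parameter values here are assumptions for demonstration. The sequential scan costs O(T) in sequence length, versus the O(T²) pairwise interactions of self-attention.

```python
import numpy as np

def ssm_scan(A, B, C, x):
    """Sequential scan of a generic diagonal linear SSM (illustrative sketch).

    h_t = A * h_{t-1} + B @ x_t   (elementwise diagonal transition)
    y_t = C @ h_t
    """
    T, _ = x.shape
    h = np.zeros(A.shape[0])
    ys = []
    for t in range(T):
        h = A * h + B @ x[t]   # O(d_state * d_in) per step -> linear in T overall
        ys.append(C @ h)       # readout from the retained state
    return np.stack(ys)

# Usage: random stable diagonal SSM over a length-6 sequence (hypothetical sizes).
rng = np.random.default_rng(0)
T, d_in, d_state, d_out = 6, 4, 8, 4
A = rng.uniform(0.5, 0.99, size=d_state)       # |A| < 1 keeps the recurrence stable
B = 0.1 * rng.normal(size=(d_state, d_in))
C = 0.1 * rng.normal(size=(d_out, d_state))
y = ssm_scan(A, B, C, rng.normal(size=(T, d_in)))
print(y.shape)  # (6, 4)
```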
Primary Area: foundation or frontier models, including LLMs
Submission Number: 1709