Integrating State Space Model and Transformer for Global-Local Processing in Super-Resolution Networks

26 Sept 2024 (modified: 12 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Computer Vision and Pattern Recognition, image super-resolution
Abstract: Single image super-resolution (SISR) aims to recover high-quality images from low-resolution inputs and is a key topic in computer vision. While Convolutional Neural Networks (CNNs) and Transformer models have shown great success in SISR, both have notable limitations: CNNs struggle to capture non-local information, and Transformers incur quadratic complexity in global attention. To address these issues, Mamba models introduce a State Space Model (SSM) with linear complexity. However, recent research shows that Mamba models underperform at capturing local dependencies in 2D images. In this paper, we propose a novel approach that integrates Mamba SSM blocks with Transformer self-attention layers, combining their strengths. We also introduce register tokens and a new SE-Scaling attention mechanism to improve performance while reducing computational cost. The resulting super-resolution network, SST (State Space Transformer), achieves state-of-the-art results on both classical and lightweight tasks.
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7328
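
The abstract describes SST as combining linear-complexity Mamba-style SSM blocks with Transformer self-attention layers in one network. Since the paper body is not included here, the following is a minimal, hypothetical PyTorch sketch of such a hybrid block under stated assumptions: the class names, the simplified diagonal recurrent mixer standing in for a real Mamba selective scan, and the plain multi-head attention layer are illustrative choices, not the authors' implementation (which additionally uses register tokens and SE-Scaling attention, not shown here).

```python
import torch
import torch.nn as nn

class SimpleSSMMixer(nn.Module):
    """Hypothetical stand-in for a Mamba-style state space mixer.

    A diagonal linear recurrence scanned over the token sequence, so the cost
    grows linearly with sequence length (the property the abstract attributes
    to SSM blocks). Real Mamba uses a selective scan; this is only a sketch.
    """
    def __init__(self, dim: int, state_dim: int = 16):
        super().__init__()
        self.in_proj = nn.Linear(dim, dim)
        self.decay_logits = nn.Parameter(torch.zeros(state_dim))  # per-state decay
        self.B = nn.Linear(dim, state_dim, bias=False)             # input -> state
        self.C = nn.Linear(state_dim, dim, bias=False)             # state -> output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        x = self.in_proj(x)
        decay = torch.sigmoid(self.decay_logits)                   # keep decay in (0, 1)
        state = x.new_zeros(x.size(0), decay.numel())
        outs = []
        for t in range(x.size(1)):                                  # O(seq_len) scan
            state = decay * state + self.B(x[:, t])
            outs.append(self.C(state))
        return torch.stack(outs, dim=1)

class HybridBlock(nn.Module):
    """One global-local unit: an SSM mixer followed by self-attention, each residual."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.ssm = SimpleSSMMixer(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim), e.g. flattened image patches
        x = x + self.ssm(self.norm1(x))
        y = self.norm2(x)
        attn_out, _ = self.attn(y, y, y)
        return x + attn_out

if __name__ == "__main__":
    patches = torch.randn(2, 64, 96)          # 2 images, an 8x8 patch grid, 96 channels
    print(HybridBlock(96)(patches).shape)     # torch.Size([2, 64, 96])
```

In this sketch the SSM mixer supplies global context at linear cost while the attention layer refines local token interactions; how SST actually orders, windows, or scales these components is specified only in the full paper.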