QMamba: On First Exploration of Vision Mamba for Image Quality Assessment

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY-NC 4.0
Abstract: In this work, we present the first exploration of the recently popular foundation model, *i.e.,* the State Space Model/Mamba, for image quality assessment (IQA), aiming to observe and excavate the perceptual potential of vision Mamba. A series of works on Mamba has shown its significant potential in various fields, *e.g.,* segmentation and classification. However, the perception capability of Mamba remains under-explored. Consequently, we propose QMamba by revisiting and adapting the Mamba model for three crucial IQA tasks, *i.e.,* task-specific, universal, and transferable IQA, which reveals its clear advantages over existing foundation models, *e.g.,* Swin Transformer, ViT, and CNNs, in terms of perception and computational cost. To improve the transferability of QMamba, we propose the StylePrompt tuning paradigm, in which lightweight mean and variance prompts are injected to assist task-adaptive transfer learning of the pre-trained QMamba for different downstream IQA tasks. Compared with existing prompt tuning strategies, our StylePrompt enables better perceptual transfer at lower computational cost. Extensive experiments on multiple synthetic and authentic IQA datasets, as well as cross-dataset IQA, demonstrate the effectiveness of the proposed QMamba.
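The abstract does not give implementation details for StylePrompt. As a rough, hypothetical sketch of what injecting lightweight mean and variance prompts could look like, the snippet below assumes an AdaIN-style re-styling of frozen backbone features, where only two small per-channel prompt vectors would be trained per downstream task (the class name, shapes, and modulation rule are illustrative assumptions, not the authors' code):

```python
import numpy as np

class StylePromptSketch:
    """Hypothetical sketch: per-channel mean/variance prompts that
    re-style features from a frozen pre-trained backbone.
    Only these small prompt vectors would be tuned per IQA task."""

    def __init__(self, num_channels, eps=1e-5):
        self.mean_prompt = np.zeros(num_channels)  # learnable shift (assumed)
        self.std_prompt = np.ones(num_channels)    # learnable scale (assumed)
        self.eps = eps

    def __call__(self, feats):
        # feats: (N, C) features from the frozen backbone
        mu = feats.mean(axis=0, keepdims=True)
        sigma = feats.std(axis=0, keepdims=True)
        normalized = (feats - mu) / (sigma + self.eps)
        # inject task-adaptive style statistics
        return normalized * self.std_prompt + self.mean_prompt

# Toy usage: 8 feature vectors with 4 channels each
feats = np.arange(32, dtype=float).reshape(8, 4)
prompt = StylePromptSketch(num_channels=4)
out = prompt(feats)
```

Because the prompts are just one mean and one variance parameter per channel, the tunable parameter count stays tiny relative to fine-tuning the whole backbone, which matches the paper's stated goal of low-cost task-adaptive transfer.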
Lay Summary: Assessing how good an image looks is surprisingly difficult for computers. Traditional models often miss small visual flaws or use too much computing power to detect them. Our research investigates a new type of AI model, called a State Space Model (specifically, "Mamba"), to improve how machines measure image quality. We introduce a model called QMamba that not only spots subtle image problems more accurately but also runs much more efficiently than existing systems. To make this model work well across different types of images, we developed a lightweight tuning method called StylePrompt. This approach adapts the model to new tasks by adjusting only a few key features, avoiding the need to retrain the entire system. Our experiments show that QMamba performs strongly on a wide range of image quality tasks, including challenging cases with synthetic, real-world, and AI-generated distortions. This work could help improve visual quality in areas like photo enhancement, image compression, and AI-generated media, where accurate quality assessment is essential.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Deep Learning->Foundation Models
Keywords: Image Quality Assessment, State Space Model, Prompt Tuning
Submission Number: 1594