Multi-scale transformer with conditioned prompt for image deraining

Xianhao Wu, Hongming Chen, Xiang Chen, Guili Xu

Published: 2025, Last Modified: 13 Mar 2025 | Digit. Signal Process. 2025 | CC BY-SA 4.0

Abstract: Recently, vision Transformers have made significant advancements in image deraining due to their ability to model non-local information. However, most existing methods do not fully explore and utilize the multi-scale properties of rain streaks, which are crucial for achieving high-quality image reconstruction. To address this limitation, we propose an effective image deraining method called MSPformer, which is based on a multi-scale Transformer with conditioned prompt. Specifically, MSPformer consists of two parallel branches, i.e., a base network and a condition network. Motivated by the recent wave of prompt learning, our condition network employs soft prompts to encode diverse rain degradation information, which is then used to dynamically modulate the base network in the deraining process. Furthermore, we develop a multi-scale feature prompt fusion method that enables representations learned at different scales to communicate effectively with each other. Extensive experiments demonstrate that the proposed framework performs favorably against state-of-the-art approaches on both synthetic and real-world benchmarks.
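The abstract does not specify how the soft prompts modulate the base network, so the following is only a minimal illustrative sketch of one common way such conditioning can work, not MSPformer's actual design. It assumes a bank of learnable prompt vectors that is mixed by softmax similarity to a pooled feature descriptor, followed by a FiLM-style per-channel scaling. The function names (`mix_prompts`, `modulate`), the pooling step, and the scaling rule are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_average_pool(features):
    """Reduce each channel (a list of values) to its mean."""
    return [sum(channel) / len(channel) for channel in features]


def mix_prompts(prompts, descriptor):
    """Blend soft prompts, weighting each by dot-product similarity
    to the descriptor (assumed mechanism, not taken from the paper)."""
    scores = [sum(p_i * d_i for p_i, d_i in zip(p, descriptor)) for p in prompts]
    weights = softmax(scores)
    dim = len(descriptor)
    return [sum(w * p[i] for w, p in zip(weights, prompts)) for i in range(dim)]


def modulate(features, condition):
    """FiLM-style modulation: scale channel k by (1 + condition[k])."""
    return [[x * (1.0 + c) for x in channel]
            for channel, c in zip(features, condition)]


# Toy input: 2 channels with 2 values each, and 2 hypothetical soft prompts.
features = [[1.0, 2.0], [3.0, 4.0]]
prompts = [[1.0, 0.0], [0.0, 1.0]]

descriptor = global_average_pool(features)
condition = mix_prompts(prompts, descriptor)
modulated = modulate(features, condition)
```

In this toy setup the second prompt aligns better with the descriptor, so the second channel is scaled more strongly. In a trained model the prompt bank would be learned, and a comparable conditioning step could be applied at each scale before the multi-scale features are fused.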