Keywords: Image Restoration, Vision-RWKV
Abstract: Image restoration (IR), which aims to recover high-quality images from degraded inputs, is a crucial task in modern image processing. Recent advances in deep learning, particularly with Convolutional Neural Networks (CNNs) and Transformers, have significantly improved restoration performance. However, existing methods lack a unified training benchmark that specifies the training iterations and configurations. Additionally, we construct an image complexity evaluation metric based on the gray-level co-occurrence matrix (GLCM) and find a bias between the image complexity distributions of commonly used IR training and testing datasets, which leads to suboptimal restoration results. We therefore construct ReSyn, a new large-scale IR dataset that uses a novel complexity-based image filtering method to achieve a balanced image complexity distribution and contains both real and AIGC-synthesized images. To measure both convergence speed and restoration capability, we establish a unified training standard that specifies the training iterations and configurations for IR models. Furthermore, we explore how to enhance transformer-based IR models with linear attention mechanisms. We propose RWKV-IR, a novel IR model that incorporates linear-complexity RWKV into the transformer-based restoration architecture and enables both global and local receptive fields. Instead of integrating Vision-RWKV directly into the transformer architecture, we replace the original Q-Shift in RWKV with a novel Depth-wise Convolution shift, which effectively models local dependencies and is combined with bi-directional attention to achieve globally and locally aware linear attention.
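As a rough illustration of a GLCM-based complexity score, the sketch below builds a co-occurrence matrix for a single pixel offset and uses its entropy as a texture-richness proxy. The quantization level, offset, and entropy choice here are illustrative assumptions, not the paper's exact metric.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for one offset (dx, dy).

    `levels`-bin quantization and the single offset are illustrative
    simplifications of a GLCM-based complexity measure.
    """
    img = (image.astype(np.float64) / 256 * levels).astype(int)
    h, w = img.shape
    P = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            # Count co-occurrences of gray levels at the given offset.
            P[img[y, x], img[y + dy, x + dx]] += 1
    return P / P.sum()

def complexity(image):
    # Entropy of the normalized GLCM: higher means richer texture.
    P = glcm(image)
    nz = P[P > 0]
    return float(-(nz * np.log2(nz)).sum())
```

A flat image scores zero (all co-occurrence mass in one cell), while a noisy image scores higher, so such a score could be used to compare complexity distributions across datasets.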
Moreover, we propose a Cross-Bi-WKV module that combines two Bi-WKV modules with different scanning orders to achieve balanced attention across the horizontal and vertical directions. Extensive experiments demonstrate the effectiveness and competitive performance of our RWKV-IR model.
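To make the two scanning orders concrete, the sketch below flattens a 2D feature map into a token sequence row by row or column by column, plus the inverse mapping. The function names are illustrative; the Bi-WKV attention applied to each sequence is omitted.

```python
import numpy as np

def scan_tokens(feat, order="horizontal"):
    """Flatten an (H, W, C) feature map into an (H*W, C) token sequence.

    'horizontal' scans row by row, 'vertical' column by column — the two
    orders a Cross-Bi-WKV-style module would combine (illustrative sketch).
    """
    if order == "vertical":
        feat = feat.transpose(1, 0, 2)  # swap H and W axes before flattening
    h, w, c = feat.shape
    return feat.reshape(h * w, c)

def unscan_tokens(tokens, h, w, order="horizontal"):
    """Inverse of scan_tokens: restore the (H, W, C) spatial layout."""
    if order == "vertical":
        return tokens.reshape(w, h, -1).transpose(1, 0, 2)
    return tokens.reshape(h, w, -1)
```

Running linear attention over both sequences and fusing the results is one way to avoid favoring either spatial direction, since each order places different pixels adjacent in the sequence.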
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2816