RFIR: A Lightweight Network for Retinal Fundus Image Restoration

Published: 01 Jan 2024, Last Modified: 28 Sept 2024 · ISBRA (1) 2024 · CC BY-SA 4.0
Abstract: Retinal fundus images can be used to diagnose and screen for ocular and other diseases. However, not all devices and operators can directly acquire high-quality retinal images. Low resolution or poor quality significantly hinders medical diagnosis and adversely affects clinical and downstream tasks. Furthermore, medical datasets lack the vast quantities of data available for natural images, and Transformers, with their large parameter counts, are prone to data-scarcity issues that hinder efficient reconstruction. To address these challenges, we introduce RFIR, a lightweight network. RFIR features a Dynamic Multi-Head Self-Attention (D-MSA) module that uses dynamic convolution at its core and adopts depth-wise convolution to simulate a window mechanism for extracting local features. It also incorporates a Sparse Spatial Self-Attention (SSSA) mechanism that computes a global attention map and selects the spatial regions with the highest contribution, so that local details and global dependencies are captured simultaneously. RFIR aggregates these features with a Feed-Forward Network (FFN) and learns a deep residual to enhance the resolution of, or remove blurriness from, retinal fundus images in a data-efficient manner. Extensive experiments demonstrate that, compared with other state-of-the-art (SOTA) image restoration methods, RFIR achieves superior results with fewer parameters and lower computational cost, delivering better visual outcomes.
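To make the abstract's description more concrete, the sketch below shows one plausible way such a block could be organized in PyTorch. It is a minimal illustration, not the authors' implementation: the class names (SparseSpatialAttention, RFIRBlock), the keep_ratio, channel widths, and the top-k selection rule are all assumptions introduced here, and the dynamic-convolution core of D-MSA is omitted for brevity, leaving only the depth-wise local branch, a sparse spatial attention branch, an FFN, and a residual connection.

```python
# Illustrative sketch only -- layer names, shapes, and the top-k rule are
# assumptions made for exposition; they are not taken from the RFIR paper.
import torch
import torch.nn as nn


class SparseSpatialAttention(nn.Module):
    """Keep only the k highest-scoring spatial positions of a global map."""

    def __init__(self, channels: int, keep_ratio: float = 0.25):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-pixel contribution score
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        scores = self.score(x).flatten(2)                    # (b, 1, h*w)
        k = max(1, int(self.keep_ratio * h * w))
        topk = scores.topk(k, dim=-1).indices                # indices of kept positions
        mask = torch.zeros_like(scores).scatter_(-1, topk, 1.0)
        attn = torch.softmax(scores.masked_fill(mask == 0, float("-inf")), dim=-1)
        return x * attn.view(b, 1, h, w)                     # re-weight the sparse regions


class RFIRBlock(nn.Module):
    """One residual block: depth-wise local branch + sparse global branch + FFN."""

    def __init__(self, channels: int = 32):
        super().__init__()
        # depth-wise conv stands in for the window-style local feature extractor
        self.local = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.globl = SparseSpatialAttention(channels)
        self.ffn = nn.Sequential(
            nn.Conv2d(channels, channels * 2, 1), nn.GELU(),
            nn.Conv2d(channels * 2, channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fused = self.local(x) + self.globl(x)   # aggregate local and global features
        return x + self.ffn(fused)              # learn a residual on top of the input


if __name__ == "__main__":
    block = RFIRBlock(32)
    out = block(torch.randn(1, 32, 64, 64))
    print(out.shape)  # torch.Size([1, 32, 64, 64])
```

In a full restoration network, several such blocks would typically sit between a shallow feature extractor and an upsampling or reconstruction head, with the final output added to (an upsampled copy of) the degraded input so the network only has to learn the residual correction.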