LGFN: Lightweight Light Field Image Super-Resolution using Local Convolution Modulation and Global Attention Feature Extraction

Published: 01 Jan 2024, Last Modified: 18 Apr 2025, Venue: CVPR Workshops 2024, License: CC BY-SA 4.0
Abstract: By capturing both the intensity and the direction of light rays in a scene, a light field (LF) encodes 3D scene cues into a 4D LF image, which has a wide range of applications (e.g., post-capture refocusing and depth sensing). LF image super-resolution (SR) aims to improve image resolution, which is limited by the performance of the LF camera sensor. Although existing methods have achieved promising results, their practical application is limited because the models are not lightweight enough. In this paper, we propose a lightweight model named LGFN, which integrates the local and global features of different views and the features of different channels for LF image SR. Specifically, because neighboring regions around the same pixel position in different sub-aperture images exhibit similar structural relationships, we design a lightweight CNN-based feature extraction module (namely, DGCE) that extracts local features more effectively through feature modulation. Meanwhile, because positions beyond the boundaries in the LF image present large disparities, we propose an efficient spatial attention module (namely, ESAM), which uses decomposable large-kernel convolution to obtain an enlarged receptive field, and an efficient channel attention module (namely, ECAM). Compared with existing LF image SR models with large parameter counts, our model has 0.45M parameters and 19.33G FLOPs while achieving competitive results. Extensive experiments with ablation studies demonstrate the effectiveness of the proposed method, which ranked second in Track 2 (Fidelity & Efficiency) of the NTIRE 2024 Light Field Super Resolution Challenge and seventh in Track 1 (Fidelity).
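To illustrate why a decomposable large-kernel convolution can enlarge the receptive field at low cost, the sketch below counts weights for one common decomposition: a small depthwise convolution, followed by a dilated depthwise convolution, followed by a 1x1 pointwise convolution. The abstract does not state which decomposition ESAM uses, so the kernel sizes, dilation, channel count, and function names here are illustrative assumptions, not the paper's configuration. Biases are ignored.

```python
def depthwise_params(channels: int, kernel: int) -> int:
    """Weights in a single depthwise conv with a kernel x kernel filter per channel."""
    return channels * kernel * kernel


def decomposed_params(channels: int, local_kernel: int, dilated_kernel: int) -> int:
    """Weights in an assumed decomposition:
    depthwise (local) + dilated depthwise + 1x1 pointwise convolution."""
    local = depthwise_params(channels, local_kernel)
    dilated = depthwise_params(channels, dilated_kernel)  # dilation adds no weights
    pointwise = channels * channels
    return local + dilated + pointwise


def receptive_field(local_kernel: int, dilated_kernel: int, dilation: int) -> int:
    """Side length of the receptive field after stacking the two depthwise convs."""
    return local_kernel + (dilated_kernel - 1) * dilation


if __name__ == "__main__":
    channels = 64  # hypothetical channel width
    rf = receptive_field(local_kernel=5, dilated_kernel=7, dilation=3)
    print("receptive field:", rf)
    print("single depthwise conv at that size:", depthwise_params(channels, rf))
    print("decomposed version:", decomposed_params(channels, 5, 7))
```

Under these assumed settings the stacked convolutions cover a 23x23 window with 8,832 weights, whereas a single 23x23 depthwise convolution would need 33,856. The pointwise layer also mixes channels, which a plain depthwise convolution does not.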