Multidimension Attention Network for Full-Reference Light Field Image Quality Assessment

Published: 01 Jan 2025 · Last Modified: 24 Oct 2025 · IEEE Trans. Instrum. Meas. 2025 · CC BY-SA 4.0
Abstract: Light field (LF) imaging enables postcapture adjustment of focus and view, which facilitates many new applications. However, various types of distortion and noise are inevitably introduced during LF image (LFI) generation and processing. It is crucial to assess the perceptual quality of distorted LFIs to promote the development of LF imaging. Most existing full-reference (FR) LF quality assessment methods are inherently limited because they overlook the global and local dependencies of the cropped image patches. To mitigate this issue, this article proposes a multidimensional attention network for FR LFI quality assessment (LF-IQA). Specifically, we first propose a multilevel Chebyshev stack (MLCS) structure that reorganizes LF views into auxiliary view stacks sharing the same Chebyshev distance along the horizontal, vertical, left-diagonal, and right-diagonal directions. By constraining the Chebyshev distances, the view correlation is kept consistent in each angular direction. Then, a multidimensional attention network is put forward, consisting of three components: a Vision Transformer (ViT) feature extraction module, a multidimensional discrepancy transformer module, and a quality regression module. By aggregating the ViT and the Swin Transformer, the proposed network enables adequate global and local information interaction among the reference, distorted, and computed discrepancy features. Comprehensive experiments on three public LFI quality evaluation datasets demonstrate the superiority of the proposed method. The code is available at https://github.com/ldyorchid/LF-IQA.
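The view reorganization described above can be sketched as follows. This is a minimal illustration only, assuming a square angular grid of views indexed by (u, v); the function names, the 5×5 default, and the grouping details are hypothetical and not taken from the paper's implementation:

```python
def chebyshev_levels(angular_size=5):
    """Group view indices of an angular_size x angular_size light field
    by their Chebyshev distance from the central view.

    Hypothetical sketch of the multilevel Chebyshev stack (MLCS) idea:
    views at the same Chebyshev distance form one level.
    """
    c = angular_size // 2  # central view index
    levels = {}
    for u in range(angular_size):
        for v in range(angular_size):
            d = max(abs(u - c), abs(v - c))  # Chebyshev distance to center
            levels.setdefault(d, []).append((u, v))
    return levels

def directional_stacks(angular_size=5):
    """Collect view indices along the four angular directions through the
    central view: horizontal, vertical, left diagonal, right diagonal."""
    c = angular_size // 2
    idx = range(angular_size)
    return {
        "horizontal": [(c, v) for v in idx],
        "vertical": [(u, c) for u in idx],
        "left_diagonal": [(i, i) for i in idx],
        "right_diagonal": [(i, angular_size - 1 - i) for i in idx],
    }
```

For a 5×5 angular grid this yields levels of 1, 8, and 16 views (the center plus two concentric Chebyshev "rings") and four directional stacks of 5 views each, each stack passing through the central view.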