Multitask Deep Neural Network With Knowledge-Guided Attention for Blind Image Quality Assessment

Published: 01 Jan 2024 · Last Modified: 16 Apr 2025 · IEEE Trans. Circuits Syst. Video Technol. 2024 · CC BY-SA 4.0
Abstract: Blind image quality assessment (BIQA) aims to predict the perceptual quality of an image without any reference information. However, existing methods leave considerable room for performance improvement because they make limited use of distortion knowledge. This paper proposes a novel multitask-learning-based BIQA method, termed KGANet, which takes image distortion classification as an auxiliary task and uses the knowledge learned from this task to assist accurate quality prediction. Unlike existing CNN-based methods, KGANet adopts a transformer as the backbone for feature extraction, which learns more powerful and robust representations. Specifically, it comprises two essential components: a cross-layer information fusion (CIF) module and a knowledge-guided attention (KGA) module. Because both global and local distortions can appear in an image, CIF fuses the features of adjacent layers extracted by the backbone to obtain a multiscale feature representation. KGA combines the distortion probabilities estimated by the auxiliary task with distortion embeddings, which are selected from subword-unit embeddings according to a textual template, to form distortion knowledge. This knowledge then serves as guidance to enhance the features of each layer and strengthen the connection between the main and auxiliary tasks. We demonstrate the effectiveness of the proposed KGANet through extensive experiments on benchmark databases. Experimental results show that KGANet correlates well with subjective perceptual judgments and outperforms 12 state-of-the-art BIQA methods.
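The abstract does not give implementation details, but the KGA idea it describes — weighting distortion embeddings by the auxiliary task's predicted distortion probabilities to form a knowledge vector, then using that vector to guide attention over backbone features — can be sketched as follows. This is an illustrative approximation under assumed dimensions, not the authors' implementation; the embedding matrix, logits, and features are random placeholders (in the paper, distortion embeddings come from subword-unit embeddings of a textual template).

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical sizes: K distortion types, D embedding dim, N feature tokens.
K, D, N = 5, 8, 10
rng = np.random.default_rng(0)

# Placeholder distortion embeddings (stand-ins for the template-selected
# subword-unit embeddings described in the abstract).
dist_emb = rng.standard_normal((K, D))

# Auxiliary-task classification logits -> distortion probabilities.
logits = rng.standard_normal(K)
p = softmax(logits)

# Distortion knowledge: probability-weighted sum of distortion embeddings.
knowledge = p @ dist_emb                      # shape (D,)

# Feature tokens from one backbone layer.
feats = rng.standard_normal((N, D))

# Knowledge-guided attention (simplified dot-product form): score each
# token against the knowledge vector and add a reweighted knowledge term.
scores = softmax(feats @ knowledge / np.sqrt(D))   # shape (N,)
enhanced = feats + scores[:, None] * knowledge[None, :]

print(enhanced.shape)  # -> (10, 8)
```

The probability weighting is what ties the auxiliary classifier to the main quality branch: as the estimated distortion distribution changes, the knowledge vector, and hence the feature enhancement, changes with it.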