MFT-PCQA: Multi-Modal Fusion Transformer for No-Reference Point Cloud Quality Assessment

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, Venue: ICASSP 2024, License: CC BY-SA 4.0
Abstract: Multi-modal information fusion for point cloud quality assessment (PCQA) remains understudied. Previous methods mostly adopt a late-fusion strategy, which neither fully exploits the advantages of the different modalities nor integrates them effectively. Because the human visual system (HVS) handles different types of information through both segregated processing and intertwined fusion, we propose a novel fusion transformer module for PCQA (MFT-PCQA). Specifically, we block the attention between point cloud features and image features to protect the self-attention of each modality, and use a mediate-fusion strategy to promote cross-attention between modalities, encouraging the network to extract crucial interactions. Experimental results show that the proposed method outperforms state-of-the-art PCQA approaches.
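The attention-blocking idea in the abstract can be illustrated with a small attention mask. The sketch below is not the paper's implementation: the abstract does not say how the mediate-fusion strategy is realized, so this example assumes a hypothetical set of mediator tokens that both modalities may attend to, while direct attention between point cloud tokens and image tokens is masked out. The function names, the token ordering, and the mediator tokens are all assumptions made for illustration.

```python
import math


def build_blocked_mask(n_pc, n_img, n_med):
    """Return allowed[i][j], True if query token i may attend to key token j.

    Assumed token order: point-cloud tokens, image tokens, mediator tokens.
    The mediator tokens are a hypothetical stand-in for "mediate fusion".
    """
    n = n_pc + n_img + n_med

    def group(i):
        if i < n_pc:
            return "pc"
        if i < n_pc + n_img:
            return "img"
        return "med"

    allowed = []
    for i in range(n):
        row = []
        for j in range(n):
            # Only direct point-cloud <-> image attention is blocked;
            # within-modality and mediator attention stay open.
            blocked = {group(i), group(j)} == {"pc", "img"}
            row.append(not blocked)
        allowed.append(row)
    return allowed


def masked_attention(q, k, v, allowed):
    """Scaled dot-product attention over lists of vectors with a boolean mask.

    Returns (outputs, weights). Blocked pairs get a score of -inf, so their
    softmax weight is exactly 0.
    """
    d = len(q[0])
    outputs, all_weights = [], []
    for i, qi in enumerate(q):
        scores = [
            sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
            if allowed[i][j] else float("-inf")
            for j, kj in enumerate(k)
        ]
        m = max(scores)  # finite, because every token may attend to itself
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        outputs.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: 2 point-cloud tokens, 2 image tokens, 1 mediator token.
tokens = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8], [0.5, 0.5]]
mask = build_blocked_mask(n_pc=2, n_img=2, n_med=1)
fused, attn = masked_attention(tokens, tokens, tokens, mask)
```

In this toy setup, a point cloud token places zero weight on image tokens, so any cross-modal information reaches it only through the mediator token. Whether MFT-PCQA routes cross-attention this way is not stated in the abstract.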