Abstract: Point cloud contents represent one of the prevalent formats for 3D representations. Distortions introduced at various stages in the
point cloud processing pipeline affect the visual quality, altering their geometric composition, texture information, or both. Understanding
and quantifying the impact of the distortion domain on visual quality is vital to driving rate optimization and guiding postprocessing
steps to improve the overall quality of experience. In this paper, we propose a multi-task guided, multi-modality no-reference
metric for measuring the quality of colored point clouds (M3-Unity), which uses four modalities spanning different attributes and
dimensionalities to represent point clouds. An attention mechanism establishes inter/intra associations among 3D/2D patches,
which complement each other and yield both local and global features, to fit the highly nonlinear behavior of the human visual
system. A multi-task decoder involving distortion type classification selects the best combination of the four modalities based on the
specific distortion type, aiding the regression task and enabling the in-depth analysis of the interplay between geometrical and textural
distortions. Furthermore, our framework design and attention strategy enable us to measure the impact of individual attributes and
their combinations, providing insight into how these associations contribute to quality prediction, particularly in relation to distortion type. Experimental
results demonstrate that our method effectively predicts the visual quality of point clouds, achieving state-of-the-art performance on
four benchmark datasets. The code will be released.
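
The idea of letting a distortion-type classifier steer how the four modalities are combined can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the M3-Unity implementation: the abstract does not specify the fusion rule, so a soft gating table (one row of modality weights per distortion type) stands in for the selection step, and every name here (`predict_quality`, `gate_table`, `cls_weights`, `reg_weights`) is hypothetical. The weights are random, so the output is not a meaningful quality score.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def predict_quality(modality_feats, cls_weights, gate_table, reg_weights):
    """Toy multi-task head (hypothetical, for illustration only).

    modality_feats: one feature vector per modality, all of length dim
    cls_weights:    one row per distortion type, length n_modalities * dim
    gate_table:     one row per distortion type, one weight per modality
    reg_weights:    regression weights, length dim
    """
    # Classification task: predict a distribution over distortion types
    # from the concatenated modality features.
    joint = [v for feat in modality_feats for v in feat]
    type_probs = softmax([dot(row, joint) for row in cls_weights])

    # Assumed fusion rule: expected modality weights under the predicted
    # distortion-type distribution (a soft stand-in for "selection").
    n_mod = len(modality_feats)
    modality_weights = [
        sum(p * gate_table[t][m] for t, p in enumerate(type_probs))
        for m in range(n_mod)
    ]

    # Regression task: weighted sum of modality features, then a linear map.
    dim = len(modality_feats[0])
    fused = [
        sum(modality_weights[m] * modality_feats[m][i] for m in range(n_mod))
        for i in range(dim)
    ]
    score = dot(reg_weights, fused)
    return score, type_probs, modality_weights


random.seed(0)
DIM, N_TYPES, N_MOD = 8, 3, 4  # illustrative sizes, not from the paper
feats = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_MOD)]
cls_w = [[random.gauss(0, 1) for _ in range(N_MOD * DIM)] for _ in range(N_TYPES)]
gates = [softmax([random.gauss(0, 1) for _ in range(N_MOD)]) for _ in range(N_TYPES)]
reg_w = [random.gauss(0, 1) for _ in range(DIM)]

score, type_probs, modality_weights = predict_quality(feats, cls_w, gates, reg_w)
```

Because each row of `gate_table` sums to one and `type_probs` is a probability distribution, `modality_weights` also sums to one, so inspecting it shows how much each modality contributes under the predicted distortion type.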
Primary Subject Area: [Experience] Interactions and Quality of Experience
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: Our topic is objective point cloud quality assessment. The results of our experiments and analysis can be used for point cloud compression and other tasks, which can improve the overall quality of experience from the user's perspective. The analysis will be helpful to the MPEG and JPEG communities and to other perceptual quality-related tasks.
Submission Number: 1036