Abstract: Dynamic point clouds are simple and versatile representations for volumetric video. A key challenge in processing and analyzing dynamic point clouds is the lack of explicitly defined structure and motion information across frames. This paper formulates a novel motion prediction problem that jointly learns point clouds and motion vectors: points from a key frame are moved to construct the following non-key frames so that their rendered 2D views approximate those of the input non-key frames. Our motion prediction algorithm is built upon dynamic 3D Gaussian Splatting (3DGS) training algorithms, augmented with a lightweight point cloud converter and an optimal parameter selector. Extensive experiments show that the resulting motion vectors yield a PSNR (Peak Signal-to-Noise Ratio) gain of up to 8.71 dB and an SSIM (Structural Similarity Index Measure) increase of up to 0.27 over current practice. The computed motion vectors can be leveraged in multiple downstream applications, such as error concealment, temporal super-resolution, and source coding. Using the learned motion vectors for error concealment, we observe quality improvements of up to 3.93 dB in PSNR and 0.28 in SSIM compared to a state-of-the-art end-to-end neural network.
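To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of the prediction step the abstract describes: per-point motion vectors warp a key-frame point cloud into a predicted non-key frame, and prediction quality is scored by PSNR between rendered 2D views. All function names, shapes, and values here are illustrative assumptions.

```python
import numpy as np

def predict_non_key_frame(key_points: np.ndarray,
                          motion_vectors: np.ndarray) -> np.ndarray:
    """Warp a key-frame point cloud by learned per-point motion vectors.

    key_points:     (N, 3) xyz coordinates of the key frame.
    motion_vectors: (N, 3) learned displacement for each point.
    Returns the predicted (N, 3) non-key-frame point cloud.
    """
    assert key_points.shape == motion_vectors.shape
    return key_points + motion_vectors

def psnr(rendered: np.ndarray, reference: np.ndarray, peak: float = 1.0) -> float:
    """PSNR between two rendered 2D views (H x W x C arrays in [0, peak])."""
    mse = float(np.mean((rendered - reference) ** 2))
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Toy usage: conceal a lost non-key frame by warping the last key frame.
key = np.random.rand(1000, 3).astype(np.float32)   # placeholder geometry
mv = np.zeros_like(key)
mv[:, 0] = 0.01                                    # e.g., a small x-drift
predicted = predict_non_key_frame(key, mv)
```

In the paper's setting, the motion vectors are learned by supervising the rendered views of the warped points against the input non-key-frame views, rather than being supplied directly as in this toy example.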
External IDs: dblp:conf/mmve/LeeSSZO0HH25