Consistent 3D Human Reconstruction from Monocular Video: Learning Correctable Appearance and Temporal Motion Priors

Cheng Shang, Liang An, TingTing Li, Jiajun Zhang, Yuxiang Zhang, Jidong Tian, Yebin Liu, Xubo Yang

Published: 01 Jan 2025, Last Modified: 12 Jan 2026. IEEE Transactions on Visualization and Computer Graphics. License: CC BY-SA 4.0
Abstract: Recent methods for rendering dynamic humans with NeRF and 3D Gaussian splatting have made significant progress, leveraging implicit geometry learning and image appearance rendering to create digital humans. However, when rendering from monocular video, capturing subtle and complex motion across different viewpoints and states remains challenging, primarily due to the imbalance of viewpoints. Ensuring continuity between adjacent frames when rendering from novel, free viewpoints is also difficult. To address these challenges, we first propose a pixel-level motion correction module that corrects errors in the learned representation across different viewpoints. We also introduce a temporal-information-based model that improves motion continuity by leveraging adjacent frames. Experimental results on dynamic human rendering with the NeuMan, ZJU-Mocap, and People-Snapshot datasets demonstrate that our method outperforms state-of-the-art techniques both quantitatively and qualitatively.
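The abstract states that motion continuity is improved by leveraging adjacent frames, but gives no formula. A minimal sketch of one common way to encode such a constraint, assuming a simple pixel-wise temporal consistency term between renderings of consecutive frames, is shown below; the function name, weight, and formulation are illustrative assumptions, not the authors' actual model.

# Illustrative sketch (assumed formulation, not taken from the paper): penalize
# pixel-wise changes between renderings of adjacent frames to encourage smooth
# motion across time when rendering from novel viewpoints.
import torch

def temporal_consistency_loss(render_t: torch.Tensor,
                              render_t1: torch.Tensor,
                              weight: float = 0.1) -> torch.Tensor:
    """L1 difference between adjacent rendered frames.

    render_t, render_t1: (H, W, 3) images rendered at frames t and t+1.
    weight: hypothetical balancing coefficient against the photometric loss.
    """
    return weight * torch.mean(torch.abs(render_t1 - render_t))

if __name__ == "__main__":
    frame_t = torch.rand(256, 256, 3)
    frame_t1 = frame_t + 0.01 * torch.randn(256, 256, 3)  # nearly identical frame
    print(temporal_consistency_loss(frame_t, frame_t1).item())

In practice a term like this would be added to the usual reconstruction objective; the paper's actual temporal model may differ substantially (e.g., by operating on learned features or motion parameters rather than raw pixels).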