Mesh-Centric Gaussian Splatting for Human Avatar Modelling with Real-time Dynamic Mesh Reconstruction
Abstract: Real-time mesh reconstruction is in high demand for integrating human avatars into modern computer graphics applications. Current methods typically use a coordinate-based MLP to represent the 3D scene as a Signed Distance Field (SDF) and optimize it through volumetric rendering, relying on Marching Cubes for mesh extraction. However, volumetric rendering is inefficient in both training and rendering, and the dependence on Marching Cubes significantly limits mesh extraction speed. This study introduces Mesh-Centric Gaussian Splatting (MCGS), a novel approach that represents the scene with a Mesh-Centric SDF and optimizes it using high-efficiency Gaussian Splatting. The primary innovation is the Mesh-Centric SDF, a thin layer of SDF enveloping the underlying mesh that can be efficiently derived from the mesh. Because the SDF is derived from the mesh, optimizing the SDF directly optimizes the mesh, which is recovered as its 0 iso-surface, eliminating the need for slow Marching Cubes. The secondary innovation is optimizing the Mesh-Centric SDF with high-efficiency Gaussian Splatting. By dispersing the underlying mesh of the Mesh-Centric SDF into multiple layers and generating Mesh-Constrained Gaussians on them, we create Multi-Layer Gaussians. The Mesh-Constrained Gaussians confine the Gaussians to a 2D surface space defined by the mesh, ensuring an accurate correspondence between Gaussian rendering and mesh geometry. The Multi-Layer Gaussians serve as sampling layers of the Mesh-Centric SDF and can be optimized with Gaussian Splatting, which in turn optimizes the Mesh-Centric SDF and its underlying mesh. As a result, our method directly optimizes the underlying mesh through Gaussian Splatting, combining the fast training and rendering of Gaussian Splatting with the precise surface learning of SDF. Experiments demonstrate that our method achieves dynamic mesh reconstruction at over 30 FPS, whereas SDF-based methods using Marching Cubes achieve less than 1 FPS and concurrent 3D Gaussian Splatting-based methods cannot extract reasonable meshes.
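To make the construction concrete, below is a minimal Python/NumPy sketch of the idea described in the abstract: the mesh is dispersed into several shells along vertex normals, one Gaussian center is placed per face on each shell, and the signed offset of each shell serves as the Mesh-Centric SDF value sampled at that center. The function names (`vertex_normals`, `multilayer_gaussian_centers`), the offset values, and the choice of face barycenters as Gaussian anchors are hypothetical illustrations; the paper's actual Mesh-Constrained Gaussian parameterization (scales, rotations, barycentric constraints) and its optimization through splatting are not shown.

```python
import numpy as np

def vertex_normals(verts, faces):
    """Area-weighted vertex normals accumulated from triangle faces."""
    v0, v1, v2 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    face_n = np.cross(v1 - v0, v2 - v0)            # area-weighted face normals
    normals = np.zeros_like(verts)
    for i in range(3):                             # scatter-add onto each vertex
        np.add.at(normals, faces[:, i], face_n)
    return normals / (np.linalg.norm(normals, axis=1, keepdims=True) + 1e-8)

def multilayer_gaussian_centers(verts, faces, offsets=(-0.01, 0.0, 0.01)):
    """Disperse the mesh into shells along vertex normals and place one
    Gaussian center per face per shell; the signed shell offset doubles as
    the Mesh-Centric SDF value sampled at that center (illustrative only)."""
    normals = vertex_normals(verts, faces)
    centers, sdf = [], []
    for d in offsets:
        shell = verts + d * normals                # offset surface = sampling layer
        face_centers = shell[faces].mean(axis=1)   # hypothetical anchor: barycenter
        centers.append(face_centers)
        sdf.append(np.full(len(faces), d))
    return np.concatenate(centers), np.concatenate(sdf)

# Toy usage on a single triangle: three shells yield three Gaussian centers,
# with SDF samples equal to the shell offsets.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = np.array([[0, 1, 2]])
centers, sdf = multilayer_gaussian_centers(verts, faces)
print(centers.shape, sdf)                          # (3, 3)  [-0.01  0.    0.01]
```

Under this reading, gradients from Gaussian Splatting would flow through the shell positions back to the underlying vertices, which is how the mesh itself gets optimized without Marching Cubes.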
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Human avatars play a pivotal role in immersive telepresence and the metaverse, which are prevalent applications of multimedia and multimodal technologies. Representing human avatars as meshes is crucial for industry adoption, as it enables applications such as texture editing, model sculpting, animation, and Physically Based Rendering. Whereas existing methods often require expensive 3D ground-truth data or are slow to reconstruct meshes, this work introduces a method capable of real-time mesh reconstruction and realistic image rendering. This advancement is poised to significantly push the boundaries of immersive telepresence and metaverse technology.
Supplementary Material: zip
Submission Number: 495