Abstract: This paper introduces 3D Gaussians for efficient, expressive, and editable digital avatar generation. This task faces two major challenges: 1) the unstructured nature of 3D Gaussians makes them incompatible with current generation pipelines; 2) animating 3D Gaussians in a generative setting, which involves training with multiple subjects, remains unexplored. In this paper, we propose a novel avatar generation method, named $E^{3}$Gen, to effectively address these challenges. First, we propose a novel generative UV features representation that encodes unstructured 3D Gaussians onto a structured 2D UV space defined by the SMPLX parametric model. This representation not only preserves the representation ability of the original 3D Gaussians but also introduces a shared structure among subjects that enables generative learning with the diffusion model. To tackle the second challenge, we propose a part-aware deformation module that achieves robust and accurate full-body expressive pose control. Extensive experiments demonstrate that our method achieves superior performance in avatar generation and enables expressive full-body pose control and editing.
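The core idea of the UV features representation can be illustrated with a minimal sketch: each 3D Gaussian's attributes (position offset, rotation, scale, opacity, color) are stored as channels of a fixed-resolution 2D map in the UV space of the parametric body template, so every subject shares the same image-like structure. The resolution, channel layout, and function names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Assumed channel layout (14 channels per texel):
# 3 position offset + 4 rotation quaternion + 3 scale + 1 opacity + 3 color.
UV_RES = 256          # assumed UV map resolution
CHANNELS = 3 + 4 + 3 + 1 + 3

def init_uv_features(res=UV_RES):
    """Create an empty UV feature map; conceptually one Gaussian per texel."""
    return np.zeros((res, res, CHANNELS), dtype=np.float32)

def sample_gaussians(uv_feats, uv_coords):
    """Read Gaussian attributes at given UV coordinates (values in [0, 1])."""
    res = uv_feats.shape[0]
    ij = np.clip((uv_coords * (res - 1)).astype(int), 0, res - 1)
    feats = uv_feats[ij[:, 1], ij[:, 0]]  # nearest-texel lookup, shape (N, CHANNELS)
    return {
        "offset":   feats[:, 0:3],
        "rotation": feats[:, 3:7],
        "scale":    feats[:, 7:10],
        "opacity":  feats[:, 10:11],
        "color":    feats[:, 11:14],
    }

uv_feats = init_uv_features()
gaussians = sample_gaussians(uv_feats, np.random.rand(100, 2))
print(gaussians["offset"].shape)  # (100, 3)
```

Because the representation is a regular 2D tensor shared across subjects, standard image-based diffusion architectures can be trained on it directly, which is what makes the otherwise unstructured Gaussians compatible with current generation pipelines.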
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: This work creates 3D digital avatars that enable efficient rendering (real-time, high-resolution, and realistic), full control over pose animation (including facial expressions and gestures), and local region editing. The generated digital avatars have extensive applications in multimedia domains such as video games, telecommunication, and VR/AR. Our work also supports multimedia applications such as audio-driven avatar animation.
Supplementary Material: zip
Submission Number: 3821