HIGH-AVATAR: Hierarchical Representation for One-shot Gaussian Head Avatar

03 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: 3D Head Avatar, Gaussian Splatting, One-shot Avatar Generation
Abstract: We propose HIGH-Avatar, a novel one-shot method that leverages a $\textbf{HI}$erarchical representation for animatable 3D $\textbf{G}$aussian $\textbf{H}$ead reconstruction from a single image. In contrast to existing approaches with a fixed number of Gaussians, our method enables multi-LOD (Level-of-Detail) head avatar modeling using a unified model. To capture both global and local facial characteristics, we employ a transformer-based architecture for global feature extraction and projection-based sampling for local feature acquisition. These features are effectively fused under the guidance of a depth buffer, ensuring occlusion plausibility. A coarse-to-fine learning strategy is introduced to enhance training stability and improve the perception of hierarchical details. To address the limitations of 3DMMs in modeling non-head regions such as the shoulders, we introduce a multi-region decomposition scheme, where the head and shoulders are predicted separately and then integrated through cross-region combination. Extensive experiments demonstrate that HIGH-Avatar outperforms state-of-the-art methods in terms of reconstruction quality, reenactment performance, and computational efficiency.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 1213