Human Interaction-Aware 3D Reconstruction from a Single Image
Keywords: textured 3D human models, multi-human scenes, geometric distortions, occlusions and proximity
Abstract: Reconstructing textured 3D human models from a single image is fundamental for AR/VR and digital human applications. However, existing methods mostly focus on single individuals and thus fail in multi-human scenes, where naive composition leads to artifacts such as unrealistic overlaps, missing geometry in occluded regions, and distorted interactions. To address this, we propose HUG3D, a holistic framework that explicitly models both group- and instance-level information. To mitigate perspective-induced geometric distortions, we first transform the input into a canonical orthographic space. The Human Group-Instance Multi-View Diffusion (HUG-MVD) module then generates complete multi-view normal maps and images by jointly modeling individuals and their group context. Subsequently, the Human Group-Instance Geometric Reconstruction (HUG-GR) module optimizes the geometry, leveraging physics-based interaction priors to accurately model inter-human contact, followed by high-fidelity texture reconstruction. Extensive experiments show that HUG3D significantly outperforms prior methods, producing physically plausible, high-fidelity 3D reconstructions of interacting people from a single image.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 7