Obj3Dify: Occlusion-Invariant 3D Reconstruction of Hand-Held Objects

Published: 2025 (IJCNN 2025). Last Modified: 05 Mar 2026. License: CC BY-SA 4.0
Abstract: Reconstructing high-quality textured 3D models from single images of hand-held objects is challenging due to occlusions, complex hand-object interactions, and the need for precise segmentation and texture reconstruction. In this paper, we introduce Obj3Dify, a novel framework that addresses these challenges by integrating state-of-the-art techniques in computer vision, generative modeling, and 3D reconstruction. The pipeline comprises object detection and classification with LLaVA-NeXT, segmentation with Lang-SAM, occlusion inpainting with Stable Diffusion XL, and 3D model generation with TRELLIS. We evaluate our framework on the HO3D dataset, leveraging its comprehensive 3D annotations and diverse object shapes for robust benchmarking. Comparative analyses with In-Hand3D and TRELLIS demonstrate that Obj3Dify significantly improves geometric fidelity and reduces noise in 3D reconstructions, achieving results closer to ground-truth models. An ablation study further validates the contribution of each pipeline stage, highlighting the critical roles of segmentation and occlusion inpainting in enhancing 3D model quality. Furthermore, a multiview extension is implemented and tested, demonstrating its usefulness for generating asymmetric and organic shapes. Our results establish Obj3Dify as an effective solution for occlusion-free 3D object reconstruction, advancing 3D modeling for real-world hand-object scenarios with applications in AR, robotics, and virtual content creation.
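The four-stage pipeline described in the abstract can be sketched as a simple sequential composition. The code below is a hypothetical skeleton only: each stage function is a placeholder standing in for the corresponding model (LLaVA-NeXT, Lang-SAM, Stable Diffusion XL, TRELLIS), not their real APIs, and the return values are dummy stand-ins.

```python
# Hypothetical sketch of the Obj3Dify pipeline. Each stage function is a
# placeholder for the actual model named in the comment; none of these
# reflect the real APIs of those systems.

def detect_and_classify(image):
    # Stage 1: identify the hand-held object and produce a text label
    # (done with LLaVA-NeXT in the paper). Placeholder label returned.
    return "object_label"

def segment_object(image, label):
    # Stage 2: language-driven segmentation (Lang-SAM) produces a mask
    # covering the object while excluding the occluding hand.
    return {"mask": "object_mask"}

def inpaint_occlusions(image, mask):
    # Stage 3: generative inpainting (Stable Diffusion XL) fills in the
    # regions of the object hidden by the hand.
    return "occlusion_free_object_image"

def reconstruct_3d(object_image):
    # Stage 4: single-image 3D generation (TRELLIS) yields a textured mesh.
    return {"mesh": "textured_mesh", "source": object_image}

def obj3dify(image):
    # Sequential composition of the four stages.
    label = detect_and_classify(image)
    seg = segment_object(image, label)
    clean = inpaint_occlusions(image, seg["mask"])
    return reconstruct_3d(clean)
```

The sequencing matters: segmentation must precede inpainting so that only object regions (not the hand) are completed, which is the dependency the ablation study in the paper probes.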