Open-set Hierarchical Semantic Segmentation for 3D Scene

Published: 01 Jan 2024, Last Modified: 20 Jul 2025 · ICME 2024 · CC BY-SA 4.0
Abstract: The Segment-Anything Model (SAM) shows exceptional zero-shot capabilities on 2D images. Developing a similar model for 3D, however, is challenging due to the scarcity of large-scale 3D training data. In this paper, we introduce a zero-shot algorithm that segments a 3D scene into elements at various levels of detail and organizes the results into a hierarchical tree structure. We propose a tree quality metric to evaluate the algorithm’s performance. Notably, our algorithm requires no 3D annotations: it uses robust 2D models to generate a 2D segmentation tree for each rendered image, then aggregates these 2D trees into a unified 3D segmentation tree using graph neural networks. Extensive experiments on the PartNet dataset and complex 3D scenes validate the algorithm’s effectiveness. We release the source code at https://github.com/dnvtmf/OTS.
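The aggregation step described in the abstract could be sketched in a heavily simplified form. Note this is an illustrative toy only: the paper aggregates 2D trees with graph neural networks, whereas the sketch below substitutes a greedy IoU-based merge, and it assumes each 2D segment has already been lifted to a set of 3D point IDs via the known camera pose. All function names are hypothetical.

```python
# Toy sketch (NOT the paper's method): fuse per-view segments, each
# represented as a set of 3D point ids, into one 3D hierarchy.

def iou(a, b):
    """Intersection-over-union of two point-id sets."""
    return len(a & b) / len(a | b)

def merge_segments(views, thresh=0.6):
    """Greedily fuse segments across views whose 3D point sets overlap.

    views: list of views, each a list of point-id sets.
    Returns a list of merged 3D segments (point-id sets).
    """
    merged = []
    for segs in views:
        for s in segs:
            for m in merged:
                if iou(s, m) >= thresh:
                    m |= s          # fuse into an existing 3D segment
                    break
            else:
                merged.append(set(s))  # no match: start a new 3D segment
    return merged

def build_hierarchy(segments):
    """Link merged segments into a tree by set containment.

    Larger segments become ancestors of the segments they contain.
    Returns (segments ordered largest-first, child->parent index map).
    """
    order = sorted(segments, key=len, reverse=True)
    parent = {}
    for i, s in enumerate(order):
        for j in range(i - 1, -1, -1):  # nearest larger superset = parent
            if s <= order[j]:
                parent[i] = j
                break
    return order, parent
```

A real pipeline would replace the greedy merge with learned matching (the paper's GNN aggregation) and score the resulting tree with the proposed tree quality metric; the containment step merely illustrates how part/whole relations induce the hierarchy.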