Keywords: Amodal segmentation, SAM, Open world
Abstract: Amodal segmentation, which aims to predict complete object shapes including occluded regions, remains challenging in open-world scenarios where models must generalize to novel objects and contexts. While the Segment Anything Model (SAM) has demonstrated remarkable zero-shot generalization capabilities, it is fundamentally limited to visible-region segmentation. This paper presents Amodal SAM, a framework that extends SAM's capabilities to amodal segmentation while preserving its powerful generalization ability. Our framework introduces three key components: (1) a lightweight Spatial Completion Adapter that enables occluded-region reconstruction, (2) a Target-Aware Occlusion Synthesis (TAOS) pipeline that addresses the scarcity of amodal annotations by generating diverse synthetic training data, and (3) novel learning objectives that enforce regional consistency and apply topological regularization. Extensive experiments demonstrate that Amodal SAM achieves state-of-the-art performance on standard benchmarks while exhibiting strong generalization to novel scenarios. Furthermore, our framework extends seamlessly to video sequences, marking the first attempt to tackle open-world video amodal segmentation. We hope our research can advance the field toward practical amodal segmentation systems that operate effectively in unconstrained real-world environments. Code and models will be made publicly available.
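The abstract names a lightweight Spatial Completion Adapter but does not describe its internals. Purely as an illustrative sketch of how a small residual adapter can be attached to frozen decoder features of a SAM-style model, the snippet below shows one common design; the module name, bottleneck width, and zero-initialized output projection are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class SpatialCompletionAdapter(nn.Module):
    """Illustrative lightweight adapter: a bottleneck conv block applied as a
    residual correction on frozen decoder features, nudging them toward amodal
    (visible + occluded) mask prediction. Hypothetical design, not the paper's."""

    def __init__(self, channels: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, kernel_size=1)
        self.spatial = nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1)
        self.up = nn.Conv2d(bottleneck, channels, kernel_size=1)
        self.act = nn.GELU()
        # Zero-init the output projection so training starts from the
        # unmodified (visible-only) behavior of the frozen backbone.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Residual update: frozen decoder features + small learned correction.
        delta = self.up(self.act(self.spatial(self.act(self.down(feats)))))
        return feats + delta


if __name__ == "__main__":
    # Toy check on random decoder features (batch 2, 256 channels, 64x64 grid).
    adapter = SpatialCompletionAdapter(channels=256)
    x = torch.randn(2, 256, 64, 64)
    print(adapter(x).shape)  # torch.Size([2, 256, 64, 64])
```

Zero-initializing the output projection is a standard adapter trick: the model initially reproduces the frozen network's predictions, and the occlusion-completion signal is learned as a small residual on top.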
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8671