Keywords: object-centric learning, semantic segmentation, robust classification
Abstract: Object-centric learning (OCL) aims to learn unsupervised representations that isolate individual objects from their context, motivated by goals such as out-of-distribution (OOD) generalization, compositional generalization, and structured environment modeling. Most prior work has developed slot-based mechanisms for object separation, typically evaluated on unsupervised object discovery.
Recent advances in segmentation provide a scalable alternative: class-agnostic models can separate objects directly in pixel space, enabling independent encoding. We show that such segmentation-based approaches achieve strong zero-shot performance on OOD object discovery benchmarks, scale naturally to foundation models, and flexibly handle a variable number of objects. For the task of object discovery, segmentation therefore offers a practical substitute for slot-based OCL. A broader question is how object separation contributes to downstream goals. We address this in the setting of OOD robustness, focusing on spurious background correlations. We introduce a training-free probe, **Object-Centric Classification with Applied Masks (OCCAM)**, and find that segmentation-based encodings of individual objects improve robustness compared to slot-based OCL methods. Our study does not address compositional generalization or reasoning tasks directly, but provides a complementary benchmark where object-centric representations deliver tangible benefits. We release our code and tools to enable the community to explore segmentation-based object-centric representations at scale, and to support practical applications of OCL beyond object discovery.
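The abstract's core idea is to isolate each object in pixel space with a class-agnostic mask before encoding it, so that a classifier never sees spurious background context. The sketch below illustrates that mask-then-crop step only; it is not the paper's implementation, and the function name and the choice of zeroed background are illustrative assumptions.

```python
import numpy as np

def apply_object_mask(image, mask, background_value=0):
    """Illustrative mask-then-crop step (not the paper's code):
    zero out pixels outside the object mask, then crop the result
    to the mask's bounding box so only the object is encoded."""
    # Broadcast the 2-D boolean mask over the image's channel axis.
    masked = np.where(mask[..., None], image, background_value)
    ys, xs = np.nonzero(mask)
    return masked[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

# Toy example: a 6x6 RGB image whose "object" occupies a 2x2 region.
image = np.random.randint(1, 256, size=(6, 6, 3), dtype=np.uint8)
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 3:5] = True
crop = apply_object_mask(image, mask)
print(crop.shape)  # (2, 2, 3)
```

In a full pipeline, each such crop would be passed to a frozen image encoder, making the probe training-free as the abstract describes.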
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 2590