Keywords: object-centric learning, semantic segmentation, robust classification
Abstract: Object-centric learning (OCL) aims to learn unsupervised representations that isolate individual objects from their context, motivated by goals such as out-of-distribution (OOD) generalization, compositional generalization, and structured environment modeling. Most prior work has developed slot-based mechanisms for object separation, typically evaluated on unsupervised object discovery.
Recent advances in segmentation provide a scalable alternative: class-agnostic models can separate objects directly in pixel space, enabling independent encoding. We show that such segmentation-based approaches achieve strong zero-shot performance on OOD object discovery benchmarks, scale naturally to foundation models, and flexibly handle a variable number of objects. For the task of object discovery, segmentation therefore offers a practical substitute for slot-based OCL. A broader question is how object separation contributes to downstream goals. We address this in the setting of OOD robustness, focusing on spurious background correlations. We introduce a training-free probe, **Object-Centric Classification with Applied Masks (OCCAM)**, and find that segmentation-based encodings of individual objects improve robustness compared to slot-based OCL methods. Our study does not address compositional generalization or reasoning tasks directly, but provides a complementary benchmark where object-centric representations deliver tangible benefits. We release our code and tools to enable the community to explore segmentation-based object-centric representations at scale, and to support practical applications of OCL beyond object discovery.
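The abstract's core idea is to isolate each object in pixel space with a class-agnostic mask before encoding it, so that a classifier never sees spurious background context. The sketch below illustrates that mask-then-crop step only; it is not the paper's implementation, and the function name and the choice of zeroed background are illustrative assumptions.

```python
import numpy as np

def apply_object_mask(image, mask, background_value=0):
    """Illustrative mask-then-crop step (not the paper's code):
    zero out pixels outside the object mask, then crop the result
    to the mask's bounding box so only the object is encoded."""
    # Broadcast the 2-D boolean mask over the image's channel axis.
    masked = np.where(mask[..., None], image, background_value)
    ys, xs = np.nonzero(mask)
    return masked[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

# Toy example: a 6x6 RGB image whose "object" occupies a 2x2 region.
image = np.random.randint(1, 256, size=(6, 6, 3), dtype=np.uint8)
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 3:5] = True
crop = apply_object_mask(image, mask)
print(crop.shape)  # (2, 2, 3)
```

In a full pipeline, each such crop would be passed to a frozen image encoder, making the probe training-free as the abstract describes.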
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 2590