Rethinking Object-Centric Representations in the Era of Foundational Segmentation Models

Published: 06 Mar 2025, Last Modified: 06 Mar 2025
Venue: SCSL @ ICLR 2025
License: CC BY 4.0
Track: regular paper (up to 6 pages)
Keywords: object-centric representation, foundational segmentation models, spurious backgrounds correlation, robust classification
Abstract: Object-centric learning (OCL) aims to represent each object's information independently and to minimize interference from backgrounds and other objects. OCL is expected to aid model generalization, especially in out-of-distribution (OOD) settings. However, the community's effort has focused on improving unsupervised entity segmentation performance, which is secondary to the main objective. We challenge this. We argue that segmentation is no longer the main barrier: recent class-agnostic segmentation methods reliably localize objects in a zero-shot manner. Instead, we advocate for a renewed emphasis on how decomposed representations can improve OOD generalization. As a first step, we propose Object-Centric Classification with Applied Masks (OCCAM), which exploits discovered objects to extract their representations for downstream classification tasks. Our experiments on datasets with spurious background correlations suggest that even in this task OCL representations do not lead to better generalization than object-centric representations provided by foundational segmentation models. These results showcase the importance of recognizing advances in zero-shot image segmentation when high-performing object-centric representations are the end goal. In addition, we suggest exploring new benchmarks for evaluating OCL methods that better reflect the problems these methods are designed to solve and that highlight scenarios where OCL methods are more favorable solutions than foundational segmentation models.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Presenter: ~Alexander_Rubinstein1
Submission Number: 60
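
The abstract describes OCCAM only at a high level: masks of discovered objects are applied to the image, the masked objects are encoded, and the resulting representations feed a downstream classifier. The following is a minimal sketch of one possible reading of that pipeline, not the authors' implementation. Every name here (`apply_mask`, `toy_encoder`, `classify_foreground`) is hypothetical, the masks are assumed to come from some class-agnostic segmenter, the encoder is a toy stand-in for a real image encoder, and picking the largest mask as the foreground object is an assumed heuristic that the abstract does not specify.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Mask = List[List[int]]     # binary mask, 1 marks an object pixel


def apply_mask(image: Image, mask: Mask, fill: float = 0.0) -> Image:
    """Keep pixels where the mask is 1 and replace all others with `fill`."""
    return [
        [px if m else fill for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def mask_area(mask: Mask) -> int:
    """Number of pixels covered by the mask."""
    return sum(sum(row) for row in mask)


def toy_encoder(image: Image) -> List[float]:
    """Hypothetical stand-in for an image encoder: [mean, max] intensity."""
    pixels = [px for row in image for px in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def object_representations(
    image: Image,
    masks: List[Mask],
    encoder: Callable[[Image], List[float]] = toy_encoder,
) -> List[List[float]]:
    """Encode each masked object separately, so the background is excluded."""
    return [encoder(apply_mask(image, m)) for m in masks]


def classify_foreground(
    image: Image,
    masks: List[Mask],
    classifier: Callable[[List[float]], str],
    encoder: Callable[[Image], List[float]] = toy_encoder,
) -> str:
    """Classify the representation of the assumed foreground object."""
    reps = object_representations(image, masks, encoder)
    # Assumed heuristic: the largest mask is treated as the foreground object.
    fg = max(range(len(masks)), key=lambda i: mask_area(masks[i]))
    return classifier(reps[fg])
```

In this sketch the classifier only ever sees pixels inside the selected mask, which is the property that would make it insensitive to background cues. A real setup would swap in a pretrained segmenter and encoder and would need a principled foreground-selection rule.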