Unifying Causal and Object-centric Representation Learning allows Causal Composition

Published: 06 Mar 2025, Last Modified: 15 Apr 2025
Venue: ICLR 2025 Workshop World Models
License: CC BY 4.0
Keywords: object-centric learning, causal representation learning, composition, scene graphs
TL;DR: We provide a unifying perspective on causal representation learning and object-centric learning, grounded in causal abstractions for scene graph modelling.
Abstract: The goal of Object-Centric Learning (OCL) is to enable machine learning systems to decompose complex scenes into discrete, interacting objects, supporting compositional generalization and human-like reasoning. However, existing OCL methods often fail to capture interactions from both attribute-level (semantic) and object-level (spatial) perspectives. While scene graph methods complement OCL by abstracting scenes as structured graphs, they typically rely on supervision. This position paper argues for a probabilistic perspective on Scene Graph Modelling (SGM), grounded in causal abstraction, as a unifying view on causality, OCL, and scene graphs. By treating object interactions as invariant mechanisms within object-level graphs, this perspective enables the generation of causally consistent scene compositions. We substantiate our position with conceptual discussion, rigorous definitions, conjectures, and examples, demonstrating how this perspective bridges the gap between unsupervised object discovery and explicit scene graph reasoning.
Submission Number: 32