Keywords: Object-centric learning, Probabilistic slot-attention, Identifiability, Latent mixture models
TL;DR: We propose a method for learning object-centric representations that are identifiable up to an equivalence relation we define.
Abstract: Learning modular object-centric representations is said to be crucial for systematic generalization. Existing methods show promising object-binding capabilities empirically, but theoretical identifiability guarantees remain relatively underdeveloped. Understanding when object-centric representations can theoretically be identified is important for scaling slot-based methods to high-dimensional images with correctness guarantees. To that end, we propose a probabilistic slot-attention algorithm that imposes an *aggregate* mixture prior over object-centric slot representations, thereby providing slot identifiability guarantees without supervision, up to an equivalence relation. We provide empirical verification of our theoretical identifiability result using both simple 2-dimensional data and high-resolution imaging datasets.
Primary Area: Generative models
Submission Number: 17476