Competition Priors for Object-Centric Learning

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Object-centric, Object-centric Learning, Representation Learning, Abstraction
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A non-iterative object-centric learning method built from common building blocks, namely a CNN, MaxPool layers, and a modified Cross-Attention layer.
Abstract: Humans are very good at abstracting from data and constructing concepts that can then be reused. This ability is missing in current learning systems. The field of object-centric learning tries to bridge this gap by learning abstract representations, often called slots, from data without human supervision. Various methods have been proposed to tackle this task for images, but most are overly complex, non-differentiable, or poorly scalable. In this paper, we introduce a conceptually simple, fully-differentiable, non-iterative, and scalable method called **COP** (**C**ompetition **O**ver **P**ixel features). It can be implemented using only Convolution, MaxPool, and Attention layers. Our method encodes the input image with a convolutional neural network and then uses a branch of alternating convolution and MaxPool layers to create competition and extract primitive slots. These primitive slots are then used as queries for a variant of Cross-Attention over the encoded image. Despite its simplicity, our method is competitive with or outperforms previous methods on standard benchmarks. The code is publicly available.
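
To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of the three stages: a CNN encoder, a competition branch of alternating Conv and MaxPool layers that yields primitive slots, and cross-attention with those slots as queries over the encoded pixel features. All layer widths, the number of Conv/MaxPool stages, and the use of plain scaled dot-product cross-attention are assumptions for illustration only; the paper's actual Cross-Attention variant and hyperparameters may differ.

```python
import torch
import torch.nn as nn


class COPSketch(nn.Module):
    """Illustrative sketch of the COP pipeline (hypothetical sizes)."""

    def __init__(self, in_channels=3, dim=64, num_stages=5):
        super().__init__()
        # Stage 1: CNN encoder producing a feature map of the input image.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, dim, 5, padding=2), nn.ReLU(),
            nn.Conv2d(dim, dim, 5, padding=2), nn.ReLU(),
        )
        # Stage 2: competition branch of alternating Conv and MaxPool layers;
        # each MaxPool lets locally dominant features win, and the vectors of
        # the final low-resolution map serve as primitive slots.
        comp = []
        for _ in range(num_stages):
            comp += [nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
        self.competition = nn.Sequential(*comp)
        # Stage 3: projections for cross-attention with primitive slots as
        # queries and encoded pixel features as keys/values.
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, img):
        feats = self.encoder(img)                   # (B, D, H, W) pixel features
        prim = self.competition(feats)              # (B, D, h, w) after pooling
        slots = prim.flatten(2).transpose(1, 2)     # (B, K, D) primitive slots
        tokens = feats.flatten(2).transpose(1, 2)   # (B, H*W, D)
        q, k, v = self.to_q(slots), self.to_k(tokens), self.to_v(tokens)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v, attn                       # refined slots, attention maps


# Usage: a 128x128 RGB image yields 16 slots and their attention over pixels.
model = COPSketch()
slots, attn = model(torch.randn(1, 3, 128, 128))
print(slots.shape, attn.shape)  # (1, 16, 64) and (1, 16, 16384)
```

This sketch is non-iterative and fully differentiable in the sense the abstract describes: slots are produced in a single forward pass rather than by repeated refinement steps.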
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7911