Contrastive Training of Complex-Valued Autoencoders for Object Discovery

Published: 21 Sept 2023, Last Modified: 02 Nov 2023, NeurIPS 2023 poster
Keywords: object-centric learning, complex-valued networks, unsupervised learning, temporal correlation hypothesis
TL;DR: Improvements to the architecture and training of the current state-of-the-art synchrony-based model that enable it to group objects in multi-object datasets with colour images and to simultaneously represent more than three objects.
Abstract: Current state-of-the-art object-centric models use slots and attention-based routing for binding. However, this class of models has several conceptual limitations: the number of slots is hardwired; all slots have equal capacity; training has high computational cost; there are no object-level relational factors within slots. Synchrony-based models can, in principle, address these limitations by using complex-valued activations which store binding information in their phase components. However, working examples of such synchrony-based models have been developed only very recently, and in practice they are still limited to toy grayscale datasets and simultaneous storage of fewer than three objects. Here we introduce architectural modifications and a novel contrastive learning method that greatly improve the state-of-the-art synchrony-based model. For the first time, we obtain a class of synchrony-based models capable of discovering objects in an unsupervised manner in multi-object color datasets and simultaneously representing more than three objects.
Supplementary Material: pdf
Submission Number: 11618
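
As a rough illustration of the idea in the abstract that complex-valued activations can carry binding information in their phase, the sketch below builds a few activations from a magnitude and a phase, then groups them by phase. It is not the paper's method: the function names, the fixed equal-width phase bins, and the magnitude threshold are all assumptions made for this example, standing in for whatever grouping procedure the model actually uses.

```python
import cmath
import math


def make_activation(magnitude, phase):
    """Build a complex activation: magnitude encodes feature strength,
    phase encodes which object the feature is bound to (assumed reading)."""
    return cmath.rect(magnitude, phase)


def group_by_phase(activations, num_groups, min_magnitude=0.1):
    """Assign each activation to one of `num_groups` equal-width phase bins.

    Hypothetical stand-in for a real grouping step. Activations with a
    magnitude below `min_magnitude` get the label None, since their phase
    carries little information.
    """
    bin_width = 2 * math.pi / num_groups
    labels = []
    for z in activations:
        if abs(z) < min_magnitude:
            labels.append(None)
            continue
        # Map the phase from (-pi, pi] to [0, 2*pi) before binning.
        phase = cmath.phase(z) % (2 * math.pi)
        labels.append(int(phase // bin_width) % num_groups)
    return labels


# Five made-up activations: two share a similar phase, one is near-zero.
pixels = [
    make_activation(1.0, 0.5),
    make_activation(0.8, 0.6),
    make_activation(1.0, 2.5),
    make_activation(0.05, 4.0),
    make_activation(0.9, 4.5),
]
labels = group_by_phase(pixels, num_groups=3)
print(labels)  # [0, 0, 1, None, 2]
```

The first two activations land in the same group because their phases are close, regardless of their different magnitudes, which is the sense in which phase rather than magnitude carries the grouping signal in this toy setup.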