TL;DR: We reveal that Slot Attention induces factor-wise homogeneous representations that offer significant advantages for continual object-centric learning.
Abstract: While Object-Centric Learning has shown great promise in modular perception, its extension to Continual Learning remains underexplored.
In this work, we observe that Slot Attention exhibits a distinctive behavior: it organizes latent representations into small, well-separated regions, each of which preserves identical factor states. Crucially, this organization emerges not only within the current task but also across sequential tasks with novel factors. This *inter-task separation* offers significant advantages in continual learning, which typically suffers from severe object-wise forgetting.
We refer to this phenomenon as *Factor-Wise Homogeneity* and show that its intrinsic inter-task separation is a key mechanism for preventing catastrophic forgetting in Continual Object-Centric Learning.
However, despite its strong robustness, factor-wise homogeneity alone is insufficient, because the decoder becomes a bottleneck in exploiting this separation.
To overcome this limitation and demonstrate the significance of our findings, we show that a minimal strategy, *Decoder-only Post-Replay*, which freezes the factor-wise homogeneous representations and fine-tunes only the decoder, is sufficient.
This work serves as a fundamental basis for understanding and leveraging the intrinsic dynamics of Slot Attention, offering essential insights for advancing object-centric systems.
Lay Summary: While Object-Centric Learning (OCL) has shown great promise in modular perception, its extension to Continual Learning remains underexplored. In this setting, continual learning typically suffers from severe object-wise catastrophic forgetting and destructive interference between sequential tasks.
In this work, we identify the intrinsic Factor-Wise Homogeneity property of Slot Attention, which organizes latent representations into separated manifolds that preserve identical factor states across sequential tasks with novel factors. However, exploiting this robust latent separation is limited by a decoding bottleneck. To address this limitation, we show that a minimal Decoder-only Post-Replay (DPR) strategy, which freezes the factor-wise homogeneous representations and fine-tunes only the decoder, is sufficient to overcome the destructive interference.
This intrinsic inter-task separation serves as a primary mechanism for mitigating catastrophic forgetting in Continual Object-Centric Learning (COCL). Ultimately, this work serves as a fundamental basis for understanding and leveraging the intrinsic dynamics of OCL representations in sequential settings.
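The decoder-only fine-tuning idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the linear `encode` and `decode` functions are toy stand-ins for Slot Attention and its decoder, and every name here (`decoder_only_finetune`, `replay`, `enc_w`, `dec_w`) is hypothetical. It only shows the general pattern of holding the representation fixed while updating the decoder on replayed samples from earlier and later tasks.

```python
# Toy sketch (assumption: linear encoder/decoder stand in for the real model).

def encode(x, enc_w):
    """Frozen encoder: maps an input vector to a latent vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in enc_w]

def decode(z, dec_w):
    """Trainable linear decoder: maps the latent back to input space."""
    return [sum(w * v for w, v in zip(row, z)) for row in dec_w]

def recon_loss(batch, enc_w, dec_w):
    """Mean squared reconstruction error over a batch."""
    total = 0.0
    for x in batch:
        x_hat = decode(encode(x, enc_w), dec_w)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(batch)

def decoder_only_finetune(replay, enc_w, dec_w, lr=0.05, epochs=200):
    """SGD on replayed samples that writes only to dec_w; enc_w is read-only."""
    for _ in range(epochs):
        for x in replay:
            z = encode(x, enc_w)          # frozen representation
            x_hat = decode(z, dec_w)
            for i, row in enumerate(dec_w):
                err = x_hat[i] - x[i]
                for j in range(len(row)):
                    row[j] -= lr * 2.0 * err * z[j]   # gradient of squared error
    return dec_w

# Hypothetical replay buffer mixing samples from an old task and a new task.
replay = [[1.0, 0.0], [0.9, 0.1],   # "task A" samples
          [0.0, 1.0], [0.1, 0.9]]   # "task B" samples

enc_w = [[1.0, 0.5], [-0.5, 1.0]]           # frozen encoder weights
enc_before = [row[:] for row in enc_w]      # snapshot to confirm freezing
dec_w = [[0.0, 0.0], [0.0, 0.0]]            # decoder starts untrained

loss_before = recon_loss(replay, enc_w, dec_w)
decoder_only_finetune(replay, enc_w, dec_w)
loss_after = recon_loss(replay, enc_w, dec_w)
```

In this toy setting the reconstruction loss on both tasks drops while the encoder weights stay untouched, which mirrors the intent of freezing the representation and adapting only the decoder.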
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/GIST-IRR/FWH_COCL.git
Primary Area: Deep Learning->Other Representation Learning
Keywords: Object Centric Learning, Representation Learning, Continual Learning
Originally Submitted PDF: pdf
Submission Number: 11750