Abstract: Understanding the world as a collection of interacting objects is an important cognitive ability. However, learning a structured world model that captures individual entities and their relationships without supervision remains a challenging and underexplored problem in the reinforcement learning (RL) setting. To address this challenge, we propose Object-Centric Dreamer, a model-based algorithm that implements an inductive bias for object-centric representations of dynamical systems. In our approach, object-centric representations are extracted from high-dimensional visual observations by an encoder built on the Slot Attention module. We first introduce the Object-Centric Recurrent State-Space Model (OCRSSM), which employs a graph neural network (GNN) to predict the dynamics of the environment. We then integrate OCRSSM with a GNN-based actor-critic policy trained on imagined object-centric trajectories in latent space. Evaluating our approach on object-centric tasks in environments with diverse dynamics, we show that the object-centric world model enables the agent to solve these tasks more efficiently.
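To make the pipeline concrete, here is a minimal PyTorch sketch of the two components the abstract names: a simplified Slot Attention encoder that groups image features into object slots, and a fully connected GNN transition of the kind that could serve as the dynamics core of an OCRSSM-style model. All names, shapes, and architectural details below (`SlotAttention`, `GNNTransition`, the residual update, the slot count) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SlotAttention(nn.Module):
    """Simplified Slot Attention (Locatello et al., 2020): K learnable
    slots compete for input features over a few attention iterations."""

    def __init__(self, num_slots, dim, iters=3):
        super().__init__()
        self.iters = iters
        self.scale = dim ** -0.5
        self.slots_init = nn.Parameter(torch.randn(1, num_slots, dim))
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)

    def forward(self, feats):                        # feats: (B, N, D)
        B, _, D = feats.shape
        feats = self.norm_in(feats)
        k, v = self.to_k(feats), self.to_v(feats)
        slots = self.slots_init.expand(B, -1, -1)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis: slots compete for each input.
            attn = (q @ k.transpose(1, 2) * self.scale).softmax(dim=1)
            attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean
            updates = attn @ v                            # (B, K, D)
            slots = self.gru(updates.reshape(-1, D),
                             slots.reshape(-1, D)).view(B, -1, D)
        return slots                                  # (B, K, D) object slots


class GNNTransition(nn.Module):
    """Action-conditioned transition over slots: messages are computed for
    every ordered slot pair, summed per receiver, and used to update slots."""

    def __init__(self, dim, action_dim):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.node_mlp = nn.Sequential(
            nn.Linear(2 * dim + action_dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, slots, action):                 # slots: (B, K, D)
        B, K, D = slots.shape
        senders = slots.unsqueeze(2).expand(B, K, K, D)
        receivers = slots.unsqueeze(1).expand(B, K, K, D)
        messages = self.edge_mlp(torch.cat([senders, receivers], -1))
        incoming = messages.sum(dim=1)                # aggregate per receiver
        act = action.unsqueeze(1).expand(B, K, action.size(-1))
        return slots + self.node_mlp(torch.cat([slots, incoming, act], -1))


# Hypothetical shapes: encode a flattened CNN feature map into 5 slots,
# then imagine one latent step for a 4-dimensional action.
encoder = SlotAttention(num_slots=5, dim=32)
dynamics = GNNTransition(dim=32, action_dim=4)
feats = torch.randn(2, 64, 32)                        # (batch, locations, features)
slots = encoder(feats)                                # (2, 5, 32)
next_slots = dynamics(slots, torch.randn(2, 4))       # (2, 5, 32)
```

The softmax over the slot axis makes slots compete to explain each input location, which is what yields object-wise grouping, while the pairwise message passing keeps the transition permutation-equivariant over slots, matching the relational inductive bias the abstract describes.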