Keywords: unsupervised visual learning, variational inference, multi-agent RL, representation learning, object-centric learning, imitation learning from observation, slot attention
TL;DR: We learn agent-centric representations from pixels via variational inference over latent actions, achieving strong generalization across novel agents/goals/environments and emergent cognitive properties without supervision.
Abstract: We introduce Variational Agent Discovery (VAD), an unsupervised agent representation learning algorithm that discovers agent-centric representations directly from pixels. We frame agent representation learning as a prediction problem: inferring which latent actions explain the transitions of the latent variables used to model a scene. VAD leverages slot-based attention with a variational objective that jointly learns inverse dynamics (inferring actions from transitions), forward dynamics (predicting states from actions), and agent policies (distributions over actions). Without any supervision, VAD develops representations that generalize to novel agents and goals with minimal performance degradation. Our learned representations enable downstream tasks such as action prediction and goal inference. Notably, VAD exhibits shared action representations across multiple observed agents (feature dimensions that consistently activate for the same action regardless of which agent performs it) and demonstrates teleological reasoning capabilities similar to those reported in 12-month-old infants, suggesting that these cognitive phenomena can emerge from our unsupervised agent representation learning objective.
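The abstract's joint objective (inverse dynamics, forward dynamics, and a policy prior over latent actions) can be sketched as a single ELBO-style loss. The sketch below is illustrative only, under assumed simplifications the paper does not specify: Gaussian latent actions, a squared-error reconstruction term for the forward model, and generic callables standing in for the learned networks (`inverse_model`, `forward_model`, `policy` are hypothetical names, not the authors' API).

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL( N(mu_q, exp(logvar_q)) || N(mu_p, exp(logvar_p)) ), summed over dims.
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def vad_style_loss(s_t, s_next, inverse_model, forward_model, policy, rng):
    """Toy joint loss: reconstruction via forward dynamics plus a KL term
    pulling inverse-dynamics posteriors toward the policy prior."""
    # Inverse dynamics: q(a | s_t, s_next) as a diagonal Gaussian.
    mu_q, logvar_q = inverse_model(s_t, s_next)
    # Reparameterized sample of the latent action.
    a = mu_q + np.exp(0.5 * logvar_q) * rng.standard_normal(mu_q.shape)
    # Forward dynamics: reconstruct the next latent state from (s_t, a).
    s_pred = forward_model(s_t, a)
    recon = np.sum((s_pred - s_next) ** 2)
    # Policy prior p(a | s_t): KL regularizes the inferred actions.
    mu_p, logvar_p = policy(s_t)
    return recon + gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)

# Toy linear stand-ins for the learned modules (illustrative assumptions):
inverse_model = lambda s, sn: (sn - s, np.zeros_like(s))   # action = state delta
forward_model = lambda s, a: s + a                          # additive dynamics
policy = lambda s: (np.zeros_like(s), np.zeros_like(s))     # standard-normal prior

rng = np.random.default_rng(0)
s_t = rng.standard_normal(4)
loss = vad_style_loss(s_t, s_t + 1.0, inverse_model, forward_model, policy, rng)
```

With these toy modules the loss is a non-negative scalar; in the actual method all three modules would be trained jointly on slot-attention latents rather than raw state vectors.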
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 10823