Probabilistic World Modeling with Asymmetric Distance Measure

Published: 17 Jun 2024, Last Modified: 10 Jul 2024
Venue: ICML 2024 Workshop GRaM
License: CC BY 4.0
Track: Extended abstract
Keywords: self-supervised learning, contrastive learning, Markov chain, subgoal discovery
TL;DR: We propose a contrastive learning method that learns a representation space preserving the geometric abstraction of the directed transition graph of a Markov chain, by encoding state reachability in an asymmetric similarity function.
Abstract: Representation learning is a fundamental task in machine learning that aims to uncover structure in data to facilitate downstream tasks. However, what makes a good representation for planning and reasoning in a stochastic world remains an open problem. In this work, we posit that learning a distance function is essential to enable planning and reasoning in the representation space. We show that a geometric abstraction of the probabilistic world dynamics can be embedded into the representation space through asymmetric contrastive learning. Unlike previous approaches that learn single-step, mutual similarity or compatibility measures, we learn an asymmetric similarity function that captures irreversible state reachability and supports multi-way probabilistic inference. Moreover, by conditioning on a common reference state (e.g., the observer's current state), the learned representation space allows us to discover geometrically salient states through which only a handful of paths pass. These states naturally serve as subgoals for breaking down long-horizon planning tasks. We evaluate our method in gridworld environments with various layouts and demonstrate its effectiveness in discovering subgoals.
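The abstract gives no implementation details, so the sketch below is only a minimal illustration of the core idea: an asymmetric similarity function trained with a contrastive objective over transition pairs. The two-encoder parameterization, all names (AsymmetricSim, contrastive_loss), and the hyperparameters are hypothetical assumptions, not the authors' method.

```python
# Hypothetical sketch of asymmetric contrastive learning over transitions.
# Assumption: separate "source" and "target" encoders break the symmetry of
# a plain dot product, so sim(s, s') need not equal sim(s', s), which lets
# the score represent one-way (irreversible) state reachability.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AsymmetricSim(nn.Module):
    """Asymmetric similarity score between a state and a candidate successor."""

    def __init__(self, obs_dim: int, embed_dim: int = 64):
        super().__init__()
        self.src = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))
        self.tgt = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        # Returns a (batch, batch) score matrix: entry (i, j) = sim(s_i -> s'_j).
        return self.src(s) @ self.tgt(s_next).T


def contrastive_loss(sim: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss: the observed successor s'_i is the positive for
    s_i; the other successors in the batch serve as negatives."""
    logits = sim / temperature
    labels = torch.arange(sim.size(0))
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for Markov-chain samples.
    model = AsymmetricSim(obs_dim=8)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    s = torch.randn(32, 8)       # states s_t
    s_next = torch.randn(32, 8)  # sampled successors s_{t+k}
    loss = contrastive_loss(model(s, s_next))
    loss.backward()
    opt.step()
    print(float(loss))
```

Using distinct encoders for the first and second argument is one simple way to make the learned similarity directional; the paper's actual parameterization may differ.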
Submission Number: 2