Contrastive Learning as Goal-Conditioned Reinforcement Learning

Benjamin Eysenbach; Tianjun Zhang; Sergey Levine; Ruslan Salakhutdinov

Contrastive Learning as Goal-Conditioned Reinforcement Learning

Benjamin Eysenbach, Tianjun Zhang, Sergey Levine, Ruslan Salakhutdinov

Published: 31 Oct 2022, Last Modified: 06 Apr 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: contrastive learning, reinforcement learning, goal-conditioned, representation learning

TL;DR: Contrastive learning can be turned into a goal-conditioned RL algorithm

Abstract: In reinforcement learning (RL), it is easier to solve a task if given a good representation. While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable and instead equip RL algorithms with additional representation learning parts (e.g., auxiliary losses, data augmentation). How can we design RL algorithms that directly acquire good representations? In this paper, instead of adding representation learning parts to an existing RL algorithm, we show (contrastive) representation learning methods are already RL algorithms in their own right. To do this, we build upon prior work and apply contrastive representation learning to action-labeled trajectories, in such a way that the (inner product of) learned representations exactly corresponds to a goal-conditioned value function. We use this idea to reinterpret a prior RL method as performing contrastive learning, and then use the idea to propose a much simpler method that achieves similar performance. Across a range of goal-conditioned RL tasks, we demonstrate that contrastive RL methods achieve higher success rates than prior non-contrastive methods. We also show that contrastive RL outperforms prior methods on image-based tasks, without using data augmentation or auxiliary objectives

Supplementary Material: pdf

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/contrastive-learning-as-goal-conditioned/code)

11 Replies

Loading