Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Yevgen Chebotar; Quan Vuong; Karol Hausman; Fei Xia; Yao Lu; Alex Irpan; Aviral Kumar; Tianhe Yu; Alexander Herzog; Karl Pertsch; Keerthana Gopalakrishnan; Julian Ibarz; Ofir Nachum; Sumedh Anand Sontakke; Grecia Salazar; Huong T Tran; Jodilyn Peralta; Clayton Tan; Deeksha Manjunath; Jaspiar Singh; Brianna Zitkovich; Tomas Jackson; Kanishka Rao; Chelsea Finn; Sergey Levine


Published: 30 Aug 2023, Last Modified: 20 Apr 2025, Venue: CoRL 2023 Poster, Readers: Everyone

Keywords: Reinforcement Learning, Offline RL, Transformers, Q-Learning, Robotic Manipulation

Abstract: In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as a separate token, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large, diverse real-world robotic manipulation task suite.
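To make the per-dimension discretization in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The bin count (256), the normalized action range [-1, 1], the uniform binning, and the `q_fn` interface are all assumptions introduced for illustration; the abstract specifies none of them. The sketch maps each continuous action dimension to one discrete token, then selects an action one dimension at a time, conditioning each choice on the tokens already chosen.

```python
from typing import Callable, List, Sequence

NUM_BINS = 256          # assumed bin count per action dimension (not stated in the abstract)
LOW, HIGH = -1.0, 1.0   # assumed normalized action range


def discretize(action: Sequence[float], num_bins: int = NUM_BINS) -> List[int]:
    """Map each continuous action dimension to a bin index (one token per dimension)."""
    tokens = []
    for a in action:
        clipped = min(max(a, LOW), HIGH)
        frac = (clipped - LOW) / (HIGH - LOW)
        # The upper edge (frac == 1.0) falls into the last bin.
        tokens.append(min(int(frac * num_bins), num_bins - 1))
    return tokens


def undiscretize(tokens: Sequence[int], num_bins: int = NUM_BINS) -> List[float]:
    """Map bin indices back to the continuous value at each bin's center."""
    width = (HIGH - LOW) / num_bins
    return [LOW + (t + 0.5) * width for t in tokens]


# Hypothetical interface: given a state and the tokens already chosen for
# earlier action dimensions, return one Q-value per bin for the next dimension.
QFunction = Callable[[object, List[int]], Sequence[float]]


def greedy_action(q_fn: QFunction, state: object, action_dim: int) -> List[int]:
    """Choose action tokens one dimension at a time, taking the argmax Q-value each step."""
    chosen: List[int] = []
    for _ in range(action_dim):
        q_values = q_fn(state, chosen)
        best = max(range(len(q_values)), key=lambda i: q_values[i])
        chosen.append(best)
    return chosen
```

In this sketch, a stand-in such as `lambda state, prefix: [0.0] * NUM_BINS` can play the role of `q_fn`; in the method described above, a Transformer would produce those per-bin Q-values. Selecting one dimension at a time keeps each maximization over a single dimension's bins rather than over every combination of bins across all dimensions.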

Student First Author: no

Supplementary Material: zip

Instructions: I have read the instructions for authors (https://corl2023.org/instructions-for-authors/)

Website: https://qtransformer.github.io

Publication Agreement: pdf

Poster Spotlight Video: mp4

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/q-transformer-scalable-offline-reinforcement/code)
