A New Perspective on Transformers in Online Reinforcement Learning for Continuous Control

Published: 28 Feb 2025, Last Modified: 02 Mar 2025. WRL@ICLR 2025 Poster. License: CC BY 4.0
Track: full paper
Keywords: online reinforcement learning, transformers, continuous control, robotics, performance evaluation
TL;DR: We explore transformers for continuous control tasks in online RL, highlight their inherent challenges, and provide guidelines to achieve stable performance
Abstract: Developing transformer-based models for online reinforcement learning (RL) faces a range of difficulties, such as training instability and suboptimal behavior. In this paper, we investigate whether the transformer architecture can serve as a backbone for online RL algorithms. We show that transformers can be trained with classical online RL algorithms without requiring global changes to the training process. Moreover, we explore different transformer architectures and ways to train them. As a result, we formulate a set of recommendations and practical takeaways for developing stable transformer training. We hope that our work will help in understanding the intricacies of configuring transformers for reinforcement learning and in formulating the basic principles of a training pipeline for transformer-based architectures.
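To make the abstract's central claim concrete, here is a minimal sketch of the general idea: a standard transformer encoder substituted in as the policy backbone of a classical online policy-gradient update, with the surrounding training loop left unchanged. This is not the authors' pipeline; the class name, hyperparameters, and the REINFORCE-style loss are illustrative assumptions.

```python
# Minimal sketch: transformer backbone inside a classical online RL update.
# All names and hyperparameters are illustrative, not the paper's setup.
import torch
import torch.nn as nn


class TransformerPolicy(nn.Module):
    """Gaussian policy over continuous actions, conditioned on a short
    context window of past observations instead of a single state."""

    def __init__(self, obs_dim: int, act_dim: int, d_model: int = 64,
                 n_layers: int = 2, n_heads: int = 4, context_len: int = 8):
        super().__init__()
        self.embed = nn.Linear(obs_dim, d_model)
        self.pos = nn.Parameter(torch.zeros(1, context_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.mu = nn.Linear(d_model, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_window: torch.Tensor) -> torch.distributions.Normal:
        # obs_window: (batch, context_len, obs_dim)
        h = self.encoder(self.embed(obs_window) + self.pos)
        mu = self.mu(h[:, -1])  # act from the latest timestep's token
        return torch.distributions.Normal(mu, self.log_std.exp())


# The update itself is a vanilla policy gradient; only the network sees a
# sequence. Dummy rollout data stands in for an environment here.
obs_dim, act_dim, context_len = 17, 6, 8
policy = TransformerPolicy(obs_dim, act_dim, context_len=context_len)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

obs = torch.randn(32, context_len, obs_dim)  # batch of observation windows
dist = policy(obs)
act = dist.sample()
ret = torch.randn(32)                        # placeholder returns/advantages

loss = -(dist.log_prob(act).sum(-1) * ret).mean()  # REINFORCE-style loss
opt.zero_grad()
loss.backward()
opt.step()
```

The point of the sketch is that the optimizer, loss, and data flow are those of a standard online RL algorithm; only the function approximator changed, which is the "no global changes to the training process" claim.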
Supplementary Material: zip
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Nikita_Kachaev1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 68