Accelerating Transformers in Online RL

Daniil Zelezetsky; Alexey Kovalev; Aleksandr Panov

Accelerating Transformers in Online RL

Daniil Zelezetsky, Alexey Kovalev, Aleksandr Panov

Published: 28 Feb 2025, Last Modified: 17 Apr 2025WRL@ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0

Track: full paper

Keywords: Reinforcement Learning, Transformer, Online RL, Offline RL, Robotics

TL;DR: Improving transformers performance in RL

Abstract: The appearance of transformer-based models in Reinforcement Learning (RL) has expanded the horizons of possibilities in robotics tasks, but it has simultaneously brought a wide range of challenges during its implementation, especially in model-free online RL. Some of the existing learning algorithms cannot be easily implemented with transformer-based models due to the instability of the latter. In this paper, we propose a method that uses the Accelerator agent as a transformer's trainer. The Accelerator, a simpler and more stable model, interacts with the environment independently while simultaneously training the transformer through behavior cloning during the first stage of the proposed algorithm. In the second stage, the pretrained transformer starts to interact with the environment in a fully online setting. As a result, this algorithm accelerates the transformer in terms of its performance and helps it to train online more stably.

Supplementary Material: zip

Presenter: ~Daniil_Zelezetsky1

Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.

Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding availability would significantly influence their ability to attend the workshop in person.

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 66

Loading