Keywords: Distributed Q-Learning, Transformer, Scalability, Noise Resilience, Reinforcement Learning, Multi-Agent Systems, Robustness, Deep RL
TL;DR: This paper investigates the scalability and noise resilience of the Q-Transformer in reinforcement learning, reveals its limitations, proposes a median-based reward smoothing technique for improvement, and suggests future research directions.
Abstract: The integration of deep reinforcement learning with Transformer architectures [Vaswani et al., 2017] has demonstrated potential in addressing complex tasks under partial observability. However, challenges related to scalability and training efficiency remain. This paper investigates scalability through the Distributed Q-Transformer (DQT) framework, which combines the long-range dependency modeling of Transformers with the sample efficiency of Q-learning. By leveraging distributed training, DQT enables multiple agents to explore diverse trajectories in parallel, accelerating convergence in model-based environments.
Beyond scalability, we investigate how noise impacts the performance of the Q-Transformer [Chebotar et al., 2023] and propose robust estimation techniques, including median-based filtering, to enhance its resilience. We systematically analyze the effects of noise injection and evaluate the robustness of the Q-Transformer on the MetaWorld benchmark [Yu et al., 2020]. Furthermore, we examine how different learning rate strategies influence performance under noisy conditions.
This work contributes to bridging the gap between Transformer-based policy representation and distributed reinforcement learning while addressing scalability and noise robustness, providing a viable solution for real-world robotic applications.
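To make the proposed noise-handling idea concrete, the following is a minimal sketch of median-based reward smoothing, not the paper's exact implementation: a sliding-window median filter applied to the scalar reward stream before it enters the Q-learning target. The class name `MedianRewardFilter` and the window size are illustrative assumptions.

```python
import numpy as np
from collections import deque


class MedianRewardFilter:
    """Sliding-window median filter over the scalar reward stream.

    Intended to suppress impulsive (outlier) reward noise before the
    reward is used to form the Q-learning target.
    """

    def __init__(self, window_size: int = 5):
        # Keep only the most recent `window_size` rewards.
        self.window = deque(maxlen=window_size)

    def __call__(self, reward: float) -> float:
        # Append the raw reward and return the median of the recent window.
        self.window.append(reward)
        return float(np.median(self.window))


# Usage example: a noisy reward sequence with injected outlier spikes.
rng = np.random.default_rng(0)
rewards = rng.normal(1.0, 0.1, size=20)
rewards[[5, 12]] = 10.0  # simulated reward noise spikes

filt = MedianRewardFilter(window_size=5)
smoothed = [filt(r) for r in rewards]
print(smoothed)
```

Because the median is insensitive to isolated extreme values, the occasional spiked rewards are largely removed while the underlying reward signal is preserved, which is the intuition behind using a median rather than a mean for smoothing under heavy-tailed noise.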
Submission Number: 4