AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning

Published: 15 Oct 2024, Last Modified: 15 Oct 2024, Accepted by TMLR, License: CC BY 4.0
Abstract: In this paper we investigate transformer architectures designed for partially observable online reinforcement learning. The self-attention mechanism in the transformer architecture is capable of capturing long-range dependencies, and it is the main reason behind the architecture's effectiveness in processing sequential data. Nevertheless, despite their success, transformers have two significant drawbacks that still limit their applicability in online reinforcement learning: (1) to remember all past information, the self-attention mechanism requires the whole history to be provided as context, and (2) inference in transformers is computationally expensive. In this paper, we introduce recurrent alternatives to the transformer self-attention mechanism that offer context-independent inference cost, leverage long-range dependencies effectively, and perform well in online reinforcement learning tasks. We quantify the impact of the different components of our architecture in a diagnostic environment and assess performance gains in 2D and 3D pixel-based partially observable environments (e.g., T-Maze, Mystery Path, Craftax, and Memory Maze). Compared with a state-of-the-art architecture, GTrXL, inference in our approach is at least 40% cheaper while reducing memory use by more than 50%. Our approach performs similarly to or better than GTrXL, improving upon GTrXL performance by more than 37% in harder tasks.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
- Made it more explicit in the introduction and abstract that the proposed work is explored in the context of RL.
- Modified the name of the architecture.
- Expanded the related work to include suggestions by E4UB.
- Fixed the missing text classification results in the main text.
- Added the recent Crafter results.
- Added a discussion of offline RL approaches.
- Modified the paper title.
Video: https://www.youtube.com/watch?v=-bTe48JIUds
Code: https://github.com/subho406/agalite
Supplementary Material: zip
Assigned Action Editor: ~Blake_Aaron_Richards1
Submission Number: 3069