Actions Speak Louder Than States: Going Beyond Bayesian Inference in In-Context Reinforcement Learning

ICLR 2025 Conference Submission12332 Authors

27 Sept 2024 (modified: 27 Nov 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: meta-reinforcement learning, in-context learning, decision-making
TL;DR: Investigating the factors that lead to in-context reinforcement learning
Abstract: In this paper, we investigate in-context learning (ICL) for reinforcement learning (RL), extending beyond Bayesian inference to richer, more advanced learning paradigms in transformers. Transformers have shown promise for few-shot and zero-shot learning, but their ICL capabilities in RL environments remain underexplored. Our work studies how task diversity in RL environments shapes the downstream ICL capabilities of transformers. To do so, we introduce a novel RL benchmark designed to provide the rich variety of tasks essential for this exploration. Through this environment, we not only demonstrate the critical role of task diversity in enabling transformers to implement advanced in-context learning algorithms, but also investigate the effects of model architecture, regularization, and other factors on the learning process. This study advances our understanding of the dynamics of ICL in RL, showing how diverse tasks can drive transformer models to surpass traditional learning methods.
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12332