The Timeline of Meta-Reinforcement Learning - From the beginnings to the Adaptive Agent

TMLR Paper5364 Authors

12 Jul 2025 (modified: 26 Jul 2025), Under review for TMLR, CC BY 4.0
Abstract: Humans are highly effective at using prior knowledge to adapt to novel tasks, a capability that standard machine learning models struggle to replicate because they rely on task-specific training. Meta-learning, or ‘learning to learn’, overcomes this limitation by allowing models to acquire transferable knowledge from a variety of tasks, enabling rapid adaptation to new challenges with minimal data. This survey presents a clear mathematical paradigm of meta-learning together with a formalization of common performance measures, distinguishes meta-learning from transfer learning and multi-task learning, and uses the paradigm to derive the meta-reinforcement learning paradigm. A timeline of landmark meta-reinforcement learning developments, from the earliest successes MAML and RL² to the Adaptive Agent, is provided along with the corresponding paradigms and training schemes. In this way, the work offers a comprehensive foundation for understanding meta-learning and meta-reinforcement learning, before giving an outlook on the latest developments and on the connection between meta-learning and the path towards general intelligence.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Tim_Genewein1
Submission Number: 5364