Meta-Learning and Meta-Reinforcement Learning - Tracing the Path towards DeepMind's Adaptive Agent

TMLR Paper5364 Authors

12 Jul 2025 (modified: 09 Oct 2025), Rejected by TMLR, CC BY 4.0
Abstract: Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability standard machine learning models struggle to replicate due to their reliance on task-specific training. Meta-learning overcomes this limitation by allowing models to acquire transferable knowledge from various tasks, enabling rapid adaptation to new challenges with minimal data. This survey provides a rigorous, task-based formalization of meta-learning and meta-reinforcement learning and uses that paradigm to chronicle the landmark algorithms that paved the way to DeepMind’s Adaptive Agent, consolidating the essential concepts needed to understand the Adaptive Agent and other generalist approaches. It discusses the relevance of meta-learning and meta-RL in the era of scaling foundation models and generalist agents and outlines open problems and future directions.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
- The abstract and introduction are sharpened and shortened by one page.
- The meta-learning paradigm is now almost three pages shorter, mainly due to a much shorter introduction of standard learning.
- We slightly shortened the meta-RL paradigm. The introductions of both standard learning and standard RL now focus on the respective examples, with minimal additional formalism.
- We moved the Terminology section to the Appendix.
- The figures in the Timeline section are condensed into a single figure at the beginning of the section, showing a timeline of landmark developments for a better overview. Additionally, two tables highlight the advantages and disadvantages of the gradient-based and the memory-based landmarks, respectively. These tables include suggestions for which problems the respective algorithms should be used.
- For better comparison, we added a table to the Discussion section. It lists the components presented along with the different landmarks in the Timeline section. This table complements the tables in the Timeline section, which only compare individual algorithms to each other.
Assigned Action Editor: ~Tim_Genewein1
Submission Number: 5364