Robot learning in the era of foundation models: a survey

Xuan Xiao, Jiahang Liu, Zhipeng Wang, Yanmin Zhou, Yong Qi, Shuo Jiang, Bin He, Qian Cheng

Published: 2025, Last Modified: 14 May 2025. Neurocomputing 2025. CC BY-SA 4.0

Abstract: The proliferation of large language models (LLMs) has fueled a shift in robot learning from automation towards general embodied artificial intelligence (AI). Adopting foundation models together with traditional learning methods for robot learning has gained increasing interest in the research community and shown potential for real-life applications. However, little literature comprehensively reviews these relatively new technologies in combination with robotics. The purpose of this review is to systematically assess state-of-the-art foundation models in robot learning and to identify potential areas for future research. Specifically, we first summarize the technical evolution of robot learning and identify the necessary preliminaries for foundation models, including simulators, datasets, and foundation model frameworks. We then focus on four mainstream areas of robot learning (manipulation, navigation, task planning, and reasoning) and demonstrate how foundation models can be adopted in each. Furthermore, we discuss critical issues that are neglected in the current literature, including robot hardware and software decoupling, dynamic data, and generalization performance in the presence of humans. This review highlights the state-of-the-art progress of foundation models in robot learning. Future research should focus on directions such as multimodal interaction (especially dynamics data), robotics-specific foundation models, and AI alignment.