ASCENT: Autonomous Skill Learning Toward Complex Embodied Tasks With Foundation Models

Published: 01 Jan 2025, Last Modified: 10 Oct 2025
Venue: ICRA 2025
License: CC BY-SA 4.0
Abstract: Collecting data from simulated scenarios for training robotic skills provides a safer and more controllable alternative to real-world environments. However, it demands considerable effort, including the manual construction of simulation environments, the careful design of tasks, and the challenge of obtaining effective trajectories. These limitations hinder the efficiency of data collection from simulated scenarios. In this paper, we leverage the prior knowledge of Large Language Models (LLMs) and Large Multimodal Models (LMMs) to generate simulated scenarios and embodied tasks. We introduce a novel framework, ASCENT (Autonomous Skill learning toward Complex Embodied tasks with fouNdaTion models), designed to efficiently accomplish these tasks and generate trajectory data. ASCENT features a fully autonomous skill learning mechanism based on an AI agent. During task training, the AI agent identifies suitable atomic skills from an atomic skill library to either directly complete the task or serve as an initial policy for further training. Newly acquired atomic skills are subsequently added to the library. To address training failures and enhance efficiency, the AI agent uses an LLM to automatically optimize the skill training process based on feedback received from simulations. Experimental results indicate that the number of training steps required for learning new tasks can be reduced by up to 65.9%.
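The skill learning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Skill`, `SkillLibrary`, `learn_task`, the `score`, `train`, and `revise` callables, and the two thresholds) is hypothetical. The `score` callable stands in for the agent's judgement of how well a library skill fits a task, `train` stands in for training in simulation, and `revise` stands in for the LLM adjusting the training setup from simulation feedback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Skill:
    name: str
    # Name of the library skill used as the initial policy, if any (hypothetical field).
    initialized_from: Optional[str] = None


class SkillLibrary:
    """Toy atomic skill library keyed by skill name."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def add(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def names(self) -> List[str]:
        return sorted(self._skills)

    def best_match(
        self, task: str, score: Callable[[str, str], float]
    ) -> Tuple[Optional[Skill], float]:
        # Return the highest-scoring skill for the task, or (None, 0.0) if nothing scores above 0.
        best, best_score = None, 0.0
        for skill in self._skills.values():
            s = score(task, skill.name)
            if s > best_score:
                best, best_score = skill, s
        return best, best_score


def learn_task(
    task: str,
    library: SkillLibrary,
    score: Callable[[str, str], float],
    train: Callable[[str, Optional[Skill], dict], Tuple[bool, str]],
    revise: Callable[[dict, str], dict],
    reuse_at: float = 1.0,   # assumed threshold: reuse the skill directly
    init_at: float = 0.5,    # assumed threshold: use the skill as an initial policy
    max_attempts: int = 3,
) -> Optional[Skill]:
    best, best_score = library.best_match(task, score)
    if best is not None and best_score >= reuse_at:
        return best  # an existing skill completes the task directly

    # Otherwise train a new skill, warm-started from a close enough match.
    init = best if best is not None and best_score >= init_at else None
    config: dict = {}
    for _ in range(max_attempts):
        success, feedback = train(task, init, config)
        if success:
            new_skill = Skill(task, init.name if init else None)
            library.add(new_skill)  # newly acquired skills join the library
            return new_skill
        config = revise(config, feedback)  # adjust training based on feedback
    return None  # training failed within the attempt budget
```

Passing the three callables as arguments keeps the control flow separate from the foundation-model and simulator calls, which the abstract does not specify in detail.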