Skill Discovery using Language Models

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Reinforcement learning, Large language models, Robotics
Abstract: Large Language Models (LLMs) possess a remarkable ability to understand natural language descriptions of complex robotics environments. Earlier studies have shown that LLM agents can use a predefined set of skills for robot planning in long-horizon tasks. However, requiring prior knowledge of the skill set for a given task constrains the approach's applicability and flexibility. We present L2S (short for Language2Skills), a novel approach that leverages the generalization capabilities of LLMs to decompose the natural language description of a complex task into definitions of reusable skills. Each skill is defined by an LLM-generated dense reward function and a termination condition, which in turn enable effective skill policy training and chaining for task execution. To address the uncertainty surrounding the parameters used by the LLM agent in the generated reward and termination functions, L2S trains parameter-conditioned skill policies that perform well across a broad spectrum of parameter values. Because the impact of one skill's parameters on the overall task becomes apparent only once the following skills are trained, L2S selects the most suitable parameter value during the training of those subsequent skills, effectively mitigating the risk of incorrect parameter choices. During training, L2S autonomously accumulates a skill library from continuously presented tasks and their descriptions, leveraging guidance from the LLM agent to apply this library to novel tasks. Our experimental results show that L2S generates reusable skills that solve a wide range of robot manipulation tasks.
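To make the skill abstraction in the abstract concrete, here is a minimal sketch. It is not from the paper: the point-mass dynamics, the function names (`skill_reward`, `skill_done`, `parameter_conditioned_policy`), and the hand-written policy are hypothetical stand-ins for the LLM-generated reward/termination functions and the learned parameter-conditioned policies that L2S would actually use.

```python
import numpy as np

# Illustrative only: in L2S the reward and termination functions are generated
# by an LLM and the policy is learned by RL; everything below is a toy stand-in.

def skill_reward(state: np.ndarray, target: float) -> float:
    """Dense reward in the style the abstract describes: negative distance
    to a parameterized target (here, `target` is the uncertain parameter)."""
    return -abs(state[0] - target)

def skill_done(state: np.ndarray, target: float, tol: float = 0.05) -> bool:
    """Termination condition paired with the reward: stop near the target."""
    return abs(state[0] - target) < tol

def parameter_conditioned_policy(state: np.ndarray, target: float) -> float:
    """Stand-in for a learned policy pi(a | s, theta). Because it is
    conditioned on the skill parameter `target`, one policy covers a broad
    range of parameter values, as L2S proposes."""
    return float(np.clip(target - state[0], -1.0, 1.0))  # move toward target

def execute_skill(state: np.ndarray, target: float, max_steps: int = 100):
    """Roll out one skill until its termination condition fires; the final
    state is handed to the next skill in the chain."""
    for _ in range(max_steps):
        if skill_done(state, target):
            break
        action = parameter_conditioned_policy(state, target)
        state = state + 0.1 * np.array([action])  # toy point-mass dynamics
    return state

# Chain two skills. In L2S, the parameter of the first skill (0.5 here) would
# be selected during training of the second skill, mitigating a poor choice.
state = np.zeros(1)
for target in (0.5, 1.0):
    state = execute_skill(state, target)
print(state)  # approximately [1.0]
```

In this framing, skill chaining is just sequential execution: each skill's termination condition hands the resulting state to the next skill, which is why the suitability of an earlier skill's parameter only becomes measurable once the later skills are trained.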
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8541