Abstract: While Large Language Models (LLMs) have shown great promise as agents in interactive tasks, their high computational costs limit their utility, especially for long-horizon tasks. We propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Specifically, we develop a hierarchical agent composed of a planning module, which learns via Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to achieve those sub-goals with elementary actions. Because neither module requires online access to an LLM at inference time, our method incurs a fixed cost of LLM interactions, all of which occur during training. In ScienceWorld -- a challenging interactive text environment -- our approach outperforms standard imitation learning on elementary actions alone by 16.7% (absolute). Our analysis underscores our method's efficiency relative to other LLM-based methods. We release our code and data for distillation at anon_url.com.
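The abstract's Knowledge Distillation step can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes soft-label (logit-level) distillation with a temperature, whereas the planning module described above may instead be fine-tuned directly on LLM-generated sub-goals (sequence-level distillation). All names below are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the last axis, numerically stabilized.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over next-token distributions, averaged over
    # positions and scaled by T^2 (the standard Hinton-style correction).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float(kl.mean() * temperature ** 2)
```

Minimizing this loss pushes the small student model's sub-goal distribution toward the LLM teacher's; once trained, the student runs without any LLM calls, matching the fixed-training-cost property claimed in the abstract.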
Paper Type: long
Research Area: Dialogue and Interactive Systems
Contribution Types: NLP engineering experiment, Approaches for low compute settings - efficiency, Publicly available software and/or pre-trained models
Languages Studied: English