When Do Language Models Need to Be Large?

Zhixun Chen; Yali Du; David Henry Mguni

Published: 03 Jul 2024, Last Modified: 12 Jul 2024

Venue: ICML 2024 FM-Wild Workshop Poster

License: CC BY 4.0

Keywords: large language models, optimal switching, budget

Abstract: Many leading language models (LMs) use high-intensity computational resources during both training and execution. This poses the challenge of lowering resource costs for deployment and achieving faster execution in decision-making tasks, among others. We introduce a novel plug-and-play LM framework named Language Optimising Network Distribution (LONDI). LONDI learns to selectively employ large LMs only where complex decision-making and reasoning are required, while using low-resource LMs (i.e. LMs that require less GPU usage but may not be able to solve the problem alone) everywhere else. LONDI consists of a system of two (off-)policy networks, an LM, a large LM (LLM), and a reinforcement learning module that uses switching controls to quickly learn in which system states to call the LLM. We then introduce a variant of LONDI that maintains budget constraints on LLM calls and hence on its resource usage. We test LONDI's performance on a range of tasks in ScienceWorld and BabyAI-Text and demonstrate that LONDI can solve tasks only solvable by resource-intensive LLMs while reducing GPU usage by up to 30%.
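
The switching idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `BudgetedSwitcher`, its fields, and the fixed `switch_score` function are hypothetical stand-ins. In LONDI the decision of when to call the LLM is learned by a reinforcement learning module, whereas here a hand-written score and threshold take its place purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable

State = str
Action = str


@dataclass
class BudgetedSwitcher:
    """Hypothetical sketch: route each state to a small or a large policy."""

    small_policy: Callable[[State], Action]   # stand-in for the low-resource LM
    large_policy: Callable[[State], Action]   # stand-in for the LLM
    switch_score: Callable[[State], float]    # assumption: replaces the learned switching control
    budget: int                               # maximum number of large-policy calls allowed
    threshold: float = 0.5
    large_calls: int = 0

    def act(self, state: State) -> Action:
        # Call the large policy only if the score asks for it AND budget remains.
        wants_large = self.switch_score(state) > self.threshold
        if wants_large and self.large_calls < self.budget:
            self.large_calls += 1
            return self.large_policy(state)
        return self.small_policy(state)


# Toy usage with string-returning stubs in place of real models.
switcher = BudgetedSwitcher(
    small_policy=lambda s: "small:" + s,
    large_policy=lambda s: "large:" + s,
    switch_score=lambda s: 1.0 if "hard" in s else 0.0,
    budget=1,
)
actions = [switcher.act(s) for s in ["easy step", "hard step", "hard step again"]]
```

With a budget of one call, the second state is routed to the large policy and the third falls back to the small policy even though its score is high, which mirrors the budget-constrained variant described above only at the level of control flow.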

Submission Number: 23
