Boss LLM: Adaptation via No-Regret Learning

Published: 08 Mar 2025, Last Modified: 13 Apr 2025 · SSI-FM Poster · CC BY 4.0
Keywords: Multi-LLM, No-Regret Learning, Synthetic Data, Self-Improvement
TL;DR: Given a set of base LLMs, we build a Boss LLM that adaptively selects mixtures over expert models for prompts whose topics may span multiple, possibly overlapping categories.
Abstract: The diversity of Large Language Models (LLMs) calls for more effective strategies to combine their strengths across tasks. In this work, we learn an adaptive mixture over multiple expert models, which we call the Boss LLM. By extending the multi-objective optimization with exponential weights (MOEW) algorithm, the Boss LLM selects the most suitable model for a given prompt, which may span multiple categories, with provably low regret for every category and expert model. Empirical results demonstrate that the Boss LLM not only adapts its mixture to the categories of a given prompt and improves upon the individual expert models, but also exhibits generalization properties.
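
The abstract does not spell out the update rule, so the following is only a minimal sketch of how an exponential-weights router over experts and categories could look, assuming per-category weight vectors, a bounded per-prompt loss for each expert, and a known category distribution for each prompt. All names here (`ExpWeightsRouter`, `eta`, `category_probs`, `losses`) are illustrative assumptions, not the paper's actual MOEW implementation.

```python
import numpy as np

class ExpWeightsRouter:
    """Sketch: per-category exponential weights over K expert models.

    A prompt is described by a probability vector over categories; the
    router mixes the per-category expert distributions accordingly.
    """

    def __init__(self, n_experts: int, n_categories: int, eta: float = 0.1):
        self.eta = eta  # learning rate of the exponential-weights update
        # One log-weight vector per category, initialized uniformly.
        self.log_w = np.zeros((n_categories, n_experts))

    def mixture(self, category_probs: np.ndarray) -> np.ndarray:
        """Combine per-category expert distributions, weighted by the
        prompt's category probabilities, then renormalize."""
        per_cat = np.exp(self.log_w - self.log_w.max(axis=1, keepdims=True))
        per_cat /= per_cat.sum(axis=1, keepdims=True)
        mix = category_probs @ per_cat  # (n_categories,) @ (n_cat, n_exp)
        return mix / mix.sum()

    def select_expert(self, category_probs: np.ndarray, rng=np.random) -> int:
        """Sample one expert for this prompt from the current mixture."""
        return int(rng.choice(self.log_w.shape[1], p=self.mixture(category_probs)))

    def update(self, category_probs: np.ndarray, losses: np.ndarray) -> None:
        """Multiplicative-weights step: experts with low loss on the
        prompt's categories gain weight in those categories.
        losses[k] in [0, 1] is the observed loss of expert k."""
        self.log_w -= self.eta * np.outer(category_probs, losses)

# Toy usage: a prompt that spans two of five categories.
router = ExpWeightsRouter(n_experts=3, n_categories=5)
cat_probs = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
chosen = router.select_expert(cat_probs)
router.update(cat_probs, losses=np.array([0.2, 0.9, 0.5]))
```

Under the standard exponential-weights analysis, updates of this form yield regret sublinear in the number of prompts against the best fixed expert; the per-category weighting is one plausible way to extend that guarantee to hold for every category simultaneously, as the abstract claims.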
Submission Number: 4
