Keywords: evolutionary strategies, multi-agent LLM systems, role-based delegation, logits-to-agent mapping
TL;DR: A 0.6B coordinator with an ES-evolved 10K-parameter head reads the penultimate token's hidden state to select an agent and assign Thinker/Worker/Verifier roles, outperforming single models and routers without SFT or RL.
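The delegation mechanism in the TL;DR can be sketched as a small linear head over the coordinator's penultimate-token hidden state, producing separate logits for the agent choice and the role assignment. This is an illustrative sketch only: the hidden size, agent pool, and weight shapes below are assumptions, not details from the paper, though the resulting parameter count lands in the stated ~10K range.

```python
# Hypothetical sketch of the ~10K-parameter coordinator head: a linear map
# from the penultimate token's hidden state to (agent, role) logits.
# All sizes and names here are illustrative assumptions.
import numpy as np

HIDDEN_DIM = 1024   # assumed hidden size of the ~0.6B coordinator
N_AGENTS = 3        # assumed pool of candidate LLMs
ROLES = ["Thinker", "Worker", "Verifier"]

rng = np.random.default_rng(0)
# One weight matrix per decision head; together ~6K parameters, i.e. the
# same order of magnitude as the ~10K figure in the TL;DR.
W_agent = rng.normal(0.0, 0.02, (HIDDEN_DIM, N_AGENTS))
W_role = rng.normal(0.0, 0.02, (HIDDEN_DIM, len(ROLES)))

def delegate(hidden_state: np.ndarray) -> tuple[int, str]:
    """Map a penultimate-token hidden state to (agent index, role name)."""
    agent = int(np.argmax(hidden_state @ W_agent))
    role = ROLES[int(np.argmax(hidden_state @ W_role))]
    return agent, role

h = rng.normal(size=HIDDEN_DIM)  # stand-in for a real hidden state
agent, role = delegate(h)
```

Because the head is a flat parameter vector this small, it can be optimized directly with an evolutionary strategy rather than backpropagation, which is what makes the black-box ES training in the abstract feasible.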
Abstract: Combining diverse foundation models is promising, but weight-merging is limited by mismatched architectures and closed APIs. **Trinity** addresses this with a lightweight coordinator that orchestrates collaboration among large language models (LLMs). The coordinator, comprising a compact language model ($\approx 0.6$B parameters) and a lightweight head ($\approx 10$K parameters), is optimized with an evolutionary strategy for efficient and adaptive delegation. **Trinity** processes queries over multiple turns, where at each turn the coordinator assigns one of three roles (*Thinker*, *Worker*, or *Verifier*) to a selected LLM, effectively offloading complex skill acquisition from the coordinator itself. Extensive experiments demonstrate that **Trinity** consistently outperforms individual models and existing methods across various tasks, including coding, math, reasoning, and domain knowledge, while robustly generalizing to out-of-distribution tasks. On established benchmarks, **Trinity** achieves state-of-the-art performance, including a new record of $86.2\%$ on LiveCodeBench. Theoretical and empirical analyses highlight two key factors driving this success: (1) the coordinator's hidden-state representations provide rich contextualization of inputs, and (2) under high dimensionality and strict budget constraints, the separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) provides substantial advantages over RL, imitation learning, and random search, leveraging potential block-$\varepsilon$-separability.
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 1940