Keywords: LLM Routing
Abstract: As large language models (LLMs) grow in scale and specialization, routing—selecting the best model for a given input—has become essential for efficient and effective deployment. While recent methods rely on increasingly complex learned routing strategies, their dependence on disparate training data and evaluation setups makes comparison and generalization difficult. In this work, we fundamentally rethink LLM routing by questioning whether such complexity is necessary. We show that a well-tuned k-Nearest Neighbors (kNN) approach not only matches but often outperforms state-of-the-art learned routers while being significantly more efficient. To support systematic evaluation, we introduce a suite of standardized routing benchmarks spanning instruction-following, question-answering, and reasoning tasks, as well as the first multi-modal routing dataset involving visual inputs. Our theoretical analysis reveals that the strong locality properties of model performance in embedding space enable simple non-parametric methods to achieve superior routing decisions with lower sample complexity than parametric approaches. These findings challenge the prevailing trend toward sophisticated architectures and demonstrate that simple, interpretable approaches can be surprisingly effective for LLM routing.
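To make the routing recipe the abstract describes concrete, below is a minimal sketch of a kNN router over an embedding space, assuming precomputed prompt embeddings and a table of observed per-model scores on a set of training prompts. The function and variable names (`fit_router`, `route`, `train_scores`, etc.) are illustrative assumptions, not the paper's actual interface or code.

```python
# Minimal kNN-routing sketch (illustrative, not the paper's implementation).
# Assumes: each training prompt has an embedding (from any sentence encoder)
# and a row of observed scores, one per candidate model.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def fit_router(train_embeddings: np.ndarray, k: int = 10) -> NearestNeighbors:
    """Index the training-prompt embeddings for nearest-neighbor lookup."""
    return NearestNeighbors(n_neighbors=k, metric="cosine").fit(train_embeddings)


def route(query_embedding: np.ndarray,
          index: NearestNeighbors,
          train_scores: np.ndarray) -> int:
    """Pick the model with the best average score among the query's k
    nearest training prompts; returns the chosen model's column index."""
    _, neighbor_ids = index.kneighbors(query_embedding.reshape(1, -1))
    neighbor_scores = train_scores[neighbor_ids[0]]      # shape: (k, n_models)
    return int(neighbor_scores.mean(axis=0).argmax())
```

Under the locality property the abstract appeals to, prompts that are close in embedding space tend to favor the same model, so this non-parametric lookup needs no trained router and its only tunable choices are the embedding model, the distance metric, and k.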
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 21199