Trail Mix: Adaptive Interpolation of Optimizers with Convergence Guarantees

Isaac-Neil Zanoria; Anupama Sridhar; Alexander Rosenberg Johansen

20 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Readers: Everyone | License: CC BY 4.0

Keywords: Optimizers, interpolation, convergence

TL;DR: Trail Mix is a convex framework that provably preserves convergence rates while adaptively interpolating a wide range of optimizers, acting like an ensemble when they are complementary and collapsing onto the best one when it dominates.

Abstract: Optimizers are central to modern deep learning, yet no single algorithm consistently excels across architectures or datasets. Existing methods of adaptively mixing optimizers to combine complementary strengths are promising, but are restricted to narrow optimizer families or lack rigorous guarantees, leaving a gap between theory and practice. To fill this gap, we present TrailMix, an adaptive interpolation framework that is general across all first- and quasi-second-order methods. On the theoretical front, we prove that convex combinations of optimizers satisfying a mild alignment condition preserve standard convergence rates in non-convex, convex, and strongly convex or PL regimes. For the challenging same-timescale setting, we establish a novel analysis method by lifting the stochastic dynamics to a population-level Fokker-Planck PDE, for which we prove stability using a joint free-energy Lyapunov function. Algorithmically, we extend this framework with fairness normalization, trust-region clipping, and a curvature-awareness reward that stabilizes the meta-weights and enables smoother training. These additions allow TrailMix to behave like an ensemble when optimizers are complementary and to concentrate weight when one dominates, without breaking convexity. Our empirical evaluations on an optimizer set including AdamW, Lion, SOAP, Scion, and MARS show that TrailMix consistently matches or outperforms the strongest single optimizer across a wide range of analytic loss surfaces.
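To make the idea of a convex combination of optimizers concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm. The toy quadratic loss, the two base update rules (a gradient step and a sign step), the softmax meta-weights, and the "loss decrease" reward are all assumptions chosen for illustration, and every function name is hypothetical. Fairness normalization, trust-region clipping, and the curvature-awareness reward mentioned in the abstract are not modeled.

```python
import math

def loss(x):
    # Toy ill-conditioned quadratic; an assumption, not a surface from the paper.
    return 0.5 * (x[0] ** 2 + 25.0 * x[1] ** 2)

def grad(x):
    # Analytic gradient of the toy loss above.
    return [x[0], 25.0 * x[1]]

def gradient_direction(g, lr=0.03):
    # Plain gradient-descent update direction.
    return [-lr * gi for gi in g]

def sign_direction(g, lr=0.01):
    # Sign-based update direction (a stand-in for sign-style optimizers).
    return [-lr * math.copysign(1.0, gi) if gi != 0 else 0.0 for gi in g]

def softmax(logits):
    # Numerically stable softmax: outputs are nonnegative and sum to 1,
    # so the mixture below is a convex combination.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mix_step(x, logits, meta_lr=1.0):
    g = grad(x)
    directions = [gradient_direction(g), sign_direction(g)]
    weights = softmax(logits)
    # Hypothetical reward: loss decrease if each direction were applied alone.
    base = loss(x)
    rewards = [base - loss([xi + di for xi, di in zip(x, d)]) for d in directions]
    # Convex combination of the candidate update directions.
    new_x = [
        xi + sum(w * d[i] for w, d in zip(weights, directions))
        for i, xi in enumerate(x)
    ]
    # Move meta-weight logits toward directions that earned more reward.
    new_logits = [l + meta_lr * r for l, r in zip(logits, rewards)]
    return new_x, new_logits, weights

def run(steps, x0=(3.0, 1.0)):
    x, logits = list(x0), [0.0, 0.0]
    weights = softmax(logits)
    for _ in range(steps):
        x, logits, weights = mix_step(x, logits)
    return x, weights

if __name__ == "__main__":
    x_final, w_final = run(200)
    print("final loss:", loss(x_final))
    print("last mixing weights:", w_final)
```

Because the weights come from a softmax, the mixed update always stays inside the convex hull of the base updates. In this toy setup the weights start uniform and shift toward whichever direction yields the larger loss decrease.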

Supplementary Material: zip

Primary Area: optimization

Submission Number: 23738
