Towards Dynamic Interleaving Optimizers

Published: 26 Jan 2026, Last Modified: 11 Feb 2026 · ICLR 2026 Poster · CC BY 4.0
Keywords: HPO, optimizer
Abstract: Optimizers are critical for training deep neural networks. Existing training pipelines rely on a single static optimizer (e.g., SGD) or a simple hybrid of two optimizers, which misses the opportunity to exploit the evolving dynamics of different training states and degrades model quality and convergence. In this paper, we propose **D**ynamic **O**ptimizer **I**nterleaving **T**raining (DOIT), a novel dynamic optimizer switching method that builds surrogate models to predict the performance of different optimizers from the current parameter state. DOIT uses an acquisition function that combines the surrogate predictions with transferability assessments and training-process information to select a suitable optimizer for the subsequent training phase. Experiments on various models and tasks (e.g., image and text classification, machine translation, and object detection) show that DOIT effectively enhances training, achieving faster convergence (2\% to 10\% faster) and higher accuracy (1\% to 3\% improvement). Additional independent experiments and case studies further validate DOIT's effectiveness.
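To make the switching idea concrete, below is a minimal, hedged PyTorch sketch of dynamic optimizer interleaving. It is not the paper's implementation: the helpers `surrogate_score` and `acquisition`, the switch interval, and the loss-reduction surrogate are hypothetical stand-ins for DOIT's learned surrogate models, transferability assessment, and process information.

```python
# Illustrative sketch only: periodically re-select the active optimizer
# using a simple surrogate (recent loss reduction) plus an exploration bonus.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression task and model.
X, y = torch.randn(512, 16), torch.randn(512, 1)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

# Candidate optimizers all wrap the same parameters; only the currently
# selected one takes steps.
optimizers = {
    "sgd": torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9),
    "adam": torch.optim.Adam(model.parameters(), lr=1e-3),
}

def surrogate_score(name, history):
    """Hypothetical surrogate: average loss reduction per step during the
    optimizer's recent usage; optimistic default so unseen optimizers get tried."""
    losses = history[name]
    deltas = [a - b for a, b in zip(losses, losses[1:])]
    return sum(deltas) / len(deltas) if deltas else 0.1

def acquisition(name, history, step, total_steps):
    """Combine the surrogate prediction with simple process information
    (training progress); DOIT additionally weighs transferability."""
    exploration = 0.05 * (1.0 - step / total_steps)  # explore early, exploit late
    return surrogate_score(name, history) + exploration

history = {name: [] for name in optimizers}
current = "sgd"
switch_every, total_steps = 50, 400

for step in range(total_steps):
    opt = optimizers[current]
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    history[current].append(loss.item())

    # Periodically re-select the optimizer for the next training phase.
    if (step + 1) % switch_every == 0:
        scores = {n: acquisition(n, history, step, total_steps) for n in optimizers}
        current = max(scores, key=scores.get)
        print(f"step {step + 1}: loss={loss.item():.4f}, next optimizer: {current}")
```

In this toy version the "surrogate" is just a running statistic of the active optimizer's loss trajectory; the paper instead trains surrogate models that predict each optimizer's performance directly from the current parameter state.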
Supplementary Material: zip
Primary Area: optimization
Submission Number: 3551