Twin: Tuning Learning Rate and Weight Decay of Deep Homogeneous Classifiers without Validation

TMLR Paper 7079 Authors

20 Jan 2026 (modified: 11 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: We introduce \textbf{T}une \textbf{w}ithout Validat\textbf{i}o\textbf{n} (Twin), a simple and effective pipeline for tuning the learning rate and weight decay of homogeneous classifiers without a validation set, eliminating the need to hold out data and avoiding the conventional two-step train-then-validate process. Twin leverages the margin-maximization dynamics of homogeneous networks and an empirical bias–variance scaling law that links training and test losses across hyper-parameter (HP) configurations. This modeling yields a regime-dependent, validation-free selection rule: in the \emph{non-separable} regime, the training loss varies monotonically with the test loss and is therefore predictive of generalization, whereas in the \emph{separable} regime, margin maximization makes the parameter norm a reliable indicator of generalization. Across 37 dataset–architecture configurations for image classification, Twin achieves a mean absolute error of 1.28\% relative to an \textit{Oracle} baseline that selects HPs using test accuracy. We further demonstrate Twin's benefits in scenarios where validation data are scarce, as in small-data regimes, or difficult and costly to collect, as in medical imaging tasks. We plan to release our code.
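To make the regime-dependent rule concrete, below is a minimal Python sketch of one way such a selection could be implemented. The `RunStats` fields, the `select_config` function, the use of perfect training accuracy as a separability proxy, and the preference for the smallest parameter norm are all illustrative assumptions, not the paper's released code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RunStats:
    """Summary statistics logged for one (learning-rate, weight-decay) run.

    All field names are illustrative, not taken from the paper's code.
    """
    lr: float
    wd: float
    train_loss: float   # final training loss
    param_norm: float   # final L2 norm of all parameters
    train_acc: float    # final training accuracy in [0, 1]

def select_config(runs: List[RunStats], sep_acc: float = 1.0) -> RunStats:
    """Validation-free, regime-dependent HP selection (sketch).

    Separable regime (training data fully fit): assume margin maximization
    makes the parameter norm indicative of generalization, so pick the run
    with the smallest norm. Non-separable regime: assume the training loss
    tracks the test loss monotonically, so pick the smallest training loss.
    """
    separable = [r for r in runs if r.train_acc >= sep_acc]
    if separable:
        return min(separable, key=lambda r: r.param_norm)
    return min(runs, key=lambda r: r.train_loss)
```

Under these assumptions, calling `select_config` on the summary statistics of a learning-rate/weight-decay grid returns a single configuration without ever consulting held-out data.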
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Ikko_Yamane1
Submission Number: 7079