Abstract: We introduce \textbf{T}une \textbf{w}ithout Validat\textbf{i}o\textbf{n} (Twin), a simple and effective pipeline for tuning the learning rate and weight decay of homogeneous classifiers without validation sets, eliminating the need to hold out data and avoiding the usual two-step tune-then-retrain procedure.
Twin leverages the margin-maximization dynamics of homogeneous networks and an empirical bias–variance scaling law that links training and test losses across hyper-parameter configurations.
This mathematical modeling yields a regime-dependent, validation-free selection rule: in the \emph{non-separable} regime, the training loss varies monotonically with the test loss and is therefore predictive of generalization, whereas in the \emph{separable} regime, margin maximization makes the parameter norm a reliable indicator of generalization.
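To make this rule concrete, below is a minimal, hypothetical Python sketch of how such a regime-dependent selection could look; the function name `twin_select`, the interpolation threshold `sep_threshold`, and the use of final training statistics are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def twin_select(train_losses, param_norms, sep_threshold=1e-2):
    """Illustrative regime-dependent selection rule (a sketch, not Twin's exact procedure).

    train_losses: final training loss of each hyper-parameter configuration.
    param_norms:  final parameter L2 norm of each configuration.
    sep_threshold: hypothetical cutoff below which a run is treated as
                   having reached the separable (interpolating) regime.
    Returns the index of the selected configuration.
    """
    train_losses = np.asarray(train_losses)
    param_norms = np.asarray(param_norms)

    separable = train_losses < sep_threshold
    if separable.any():
        # Separable regime: margin maximization makes the parameter norm
        # indicative of generalization, so pick the smallest norm.
        candidates = np.where(separable)[0]
        return int(candidates[np.argmin(param_norms[candidates])])
    # Non-separable regime: training loss tracks test loss monotonically,
    # so the configuration with the lowest training loss is selected.
    return int(np.argmin(train_losses))
```

For example, `twin_select([0.9, 0.001, 0.0005], [12.0, 30.0, 18.0])` returns index 2: two runs have reached the separable regime, and the one with the smaller parameter norm is chosen.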
Across 37 dataset-architecture configurations for image classification, we demonstrate that Twin achieves a mean absolute error of 1.28\% compared to an \textit{Oracle} baseline that selects hyper-parameters (HPs) using test accuracy.
We further demonstrate Twin's benefits in scenarios where validation data is scarce, as in small-data regimes, or difficult and costly to collect, as in medical imaging tasks.
We plan to release our code.