Keywords: Meta-learning; Dynamic regularization; Structural risk; Flat minima.
Abstract: The ideal regularization strategy for deep neural networks should adapt to the local geometry of the loss landscape, since solutions in high-curvature regions are sensitive to perturbations and often generalize poorly. Classic penalties are static and may therefore over-regularize in flat regions while under-regularizing in sharp ones. We propose the Structural Risk Network (SRN), a lightweight dynamic regularizer learned by meta-optimization. SRN maps the current model parameters to a state-dependent surrogate $r(\Theta;\phi)$, whose gradient is added to the task gradient at every training step, without per-step inner maximization. The surrogate is meta-aligned to a composite signal that blends two sharpness-related observables, validation-loss sensitivity and the inverse classification margin, providing complementary global and local cues. Under standard smoothness assumptions, a margin-curvature link and a validation-Hessian decomposition explain why this composite target emphasizes low-margin, high-sensitivity neighborhoods, biasing updates away from dominant curvature directions. We assess SRN's effect on curvature via an out-of-loop evaluation of the largest Hessian eigenvalue and observe reduced spikes and lower late-epoch values. In a unified protocol on CIFAR-10/100 with ResNet-8/20/32 (identical backbones, optimizer, epochs, and light augmentations), SRN consistently improves Top-1 accuracy over strong static and dynamic baselines while incurring only moderate overhead, yielding a favorable accuracy-compute trade-off.
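The core update described in the abstract, adding the gradient of a state-dependent surrogate $r(\Theta;\phi)$ to the task gradient at each step, can be sketched as follows. This is a minimal illustration, not the paper's implementation: here the surrogate is simplified to a per-parameter quadratic penalty $r(\theta;\phi)=\tfrac{1}{2}\sum_i \phi_i \theta_i^2$ on a toy quadratic task loss, and the names `srn_step`, `phi`, and `task_grad` are assumptions; in the actual method $r$ is a learned network and $\phi$ is meta-optimized against the composite sharpness signal.

```python
import numpy as np

def srn_step(theta, phi, task_grad, lr=0.1):
    """One training step: task gradient plus the surrogate's gradient.

    Illustrative stand-in for SRN: r(theta; phi) = 0.5 * sum(phi_i * theta_i^2),
    so grad_theta r = phi * theta. In the paper, r is a learned network and
    phi is updated by an outer meta-optimization loop (omitted here).
    """
    reg_grad = phi * theta  # gradient of the simplified surrogate
    return theta - lr * (task_grad(theta) + reg_grad)

# Toy task loss L(theta) = 0.5 * ||theta - target||^2 (hypothetical example).
target = np.array([1.0, -2.0])
task_grad = lambda th: th - target

theta = np.zeros(2)
phi = np.array([0.5, 0.5])  # fixed per-parameter surrogate coefficients
for _ in range(200):
    theta = srn_step(theta, phi, task_grad)

# With phi > 0 the fixed point is shrunk toward zero: theta* = target / (1 + phi)
print(theta)  # approximately [0.667, -1.333]
```

The fixed point solves $\nabla L(\theta) + \phi\,\theta = 0$, showing how the surrogate's gradient biases the solution away from the unregularized minimizer; in SRN this bias is state-dependent rather than fixed.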
Supplementary Material: pdf
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 8041