2022 (modified: 19 Nov 2022), ICML 2022, Readers: Everyone
Abstract: Weight norm $\|w\|$ and margin $\gamma$ participate in learning theory via the normalized margin $\gamma/\|w\|$. Since standard neural net optimizers do not control normalized margin, it is hard to...