Abstract: We study the impact of different reinitialization methods in several convolutional architectures for small-scale image classification datasets. We analyze the potential gains of reinitialization and highlight its limitations. We also study a new layerwise reinitialization algorithm that outperforms previous methods and suggest explanations for the observed improvements in generalization. First, we show that layerwise reinitialization increases the margin on the training examples without increasing the norm of the weights, hence improving margin-based generalization bounds for neural networks. Second, we demonstrate that it settles in flatter local minima of the loss surface. Third, it encourages learning general rules and discourages memorization by placing emphasis on the lower layers of the neural network.
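To make the layerwise reinitialization idea concrete, here is a minimal sketch (not the authors' released code) in PyTorch: after a round of training, the blocks above a chosen depth are reset to fresh random weights while the lower blocks are kept, and training then resumes, which shifts the optimization emphasis toward the lower layers. The block structure, the cut-off index, and the `train` calls are illustrative assumptions, not the paper's exact procedure.

```python
import torch.nn as nn

def reinitialize_upper_layers(model: nn.Sequential, keep_up_to: int) -> None:
    """Reset parameters of all blocks with index >= keep_up_to; keep lower blocks."""
    for idx, block in enumerate(model):
        if idx < keep_up_to:
            continue  # lower layers are preserved (they carry the general rules)
        for layer in block.modules():
            if hasattr(layer, "reset_parameters"):
                layer.reset_parameters()  # fresh random initialization

# Example: a small convolutional classifier for 32x32 images, split into blocks.
model = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    nn.Sequential(nn.Flatten(), nn.Linear(64 * 8 * 8, 10)),
)

# One round of a layerwise schedule (train() is a hypothetical training loop):
# train(model, loader)
reinitialize_upper_layers(model, keep_up_to=1)  # keep the first conv block only
# train(model, loader)
```

In this sketch the cut-off `keep_up_to` controls how much of the lower network is retained between rounds; sweeping it from deep to shallow is one plausible way to realize a layerwise schedule.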
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: - Added experiments on CIFAR10 and CIFAR100 with various choices of hyper-parameters to Appendix E.
- Added standard errors to Table 2.
- Moved the theoretical analysis to the appendix to improve the flow of the paper.
- Included complete ablation study results.
- Fixed typos.
Assigned Action Editor: ~Guido_Montufar1
Submission Number: 26