TL;DR: The learning rate affects which examples are 'easy' and 'hard' for the trained network.
Abstract: The learning rate is a key hyperparameter that affects both the speed of training and the generalization performance of neural networks.
Through a new \emph{loss-based example ranking} analysis, we show that networks trained with different learning rates focus their capacity on different parts of the data distribution, leading to solutions with different generalization properties. These findings hold across architectures and datasets and offer insight into how the learning rate shapes both model performance and example-level training dynamics in neural networks.
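The submission does not include its analysis code here, so the following is a minimal sketch of what a loss-based example ranking comparison could look like, assuming the analysis ranks examples by their per-example loss under a trained model and compares rankings across learning rates. The toy data, architecture, and learning rates (0.01 and 0.5) are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (not the authors' code): rank examples by per-example loss
# under models trained with two different learning rates, then compare the
# rankings. Data, model, and learning rates below are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import spearmanr

torch.manual_seed(0)
X = torch.randn(512, 20)        # synthetic inputs (assumption)
y = (X[:, 0] > 0).long()        # synthetic binary labels (assumption)

def train(lr, steps=200):
    """Train a small MLP with full-batch SGD at the given learning rate."""
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(X), y).backward()
        opt.step()
    return model

def example_losses(model):
    """Per-example losses; sorting these yields an easy-to-hard ranking."""
    with torch.no_grad():
        return F.cross_entropy(model(X), y, reduction="none")

losses_small_lr = example_losses(train(lr=0.01))
losses_large_lr = example_losses(train(lr=0.5))

# Spearman's rho compares the two loss-based rankings: a low value means
# the two learning rates disagree about which examples are easy or hard.
rho, _ = spearmanr(losses_small_lr.numpy(), losses_large_lr.numpy())
print(f"Spearman rank correlation between example rankings: {rho:.3f}")
```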
Style Files: I have used the style files.
Submission Number: 56