TL;DR: The learning rate affects which examples are 'easy' and 'hard' for the trained network.
Abstract: The learning rate is a key hyperparameter that affects both the speed of training and the generalization performance of neural networks.
Through a new \emph{loss-based example ranking} analysis, we show that networks trained with different learning rates focus their capacity on different parts of the data distribution, leading to solutions with different generalization properties. These findings hold across architectures and datasets and offer insight into how the learning rate shapes both model performance and example-level training dynamics in neural networks.
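The submission does not include its analysis code here, so the following is a minimal sketch of what a loss-based example ranking comparison could look like, assuming the analysis ranks examples by their per-example loss under a trained model and compares rankings across learning rates. The toy data, architecture, and learning rates (0.01 and 0.5) are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch (not the authors' code): rank examples by per-example loss
# under models trained with two different learning rates, then compare the
# rankings. Data, model, and learning rates below are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import spearmanr

torch.manual_seed(0)
X = torch.randn(512, 20)        # synthetic inputs (assumption)
y = (X[:, 0] > 0).long()        # synthetic binary labels (assumption)

def train(lr, steps=200):
    """Train a small MLP with full-batch SGD at the given learning rate."""
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(X), y).backward()
        opt.step()
    return model

def example_losses(model):
    """Per-example losses; sorting these yields an easy-to-hard ranking."""
    with torch.no_grad():
        return F.cross_entropy(model(X), y, reduction="none")

losses_small_lr = example_losses(train(lr=0.01))
losses_large_lr = example_losses(train(lr=0.5))

# Spearman's rho compares the two loss-based rankings: a low value means
# the two learning rates disagree about which examples are easy or hard.
rho, _ = spearmanr(losses_small_lr.numpy(), losses_large_lr.numpy())
print(f"Spearman rank correlation between example rankings: {rho:.3f}")
```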
Style Files: I have used the style files.
Submission Number: 56