Deep $k$-NN Label Smoothing Improves Reproducibility of Neural Network Predictions

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: $k$-nearest neighbors, neural networks, label smoothing, churn, reproducibility, stability, robustness
Abstract: Training modern neural networks is an inherently noisy process that can lead to high \emph{prediction churn} (disagreements between re-trainings of the same model, caused by factors such as randomization in parameter initialization and mini-batch ordering), even when the trained models all attain high accuracy. Such prediction churn is often undesirable in practice. In this paper, we present several baselines for reducing churn and show that using $k$-NN predictions to smooth the labels yields a new and principled method that often outperforms these baselines on churn while also improving accuracy on a variety of benchmark classification tasks and model architectures.
One-sentence Summary: Label smoothing by using Deep $k$-NN estimates improves the reproducibility of neural network training.
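Illustrative sketch: the abstract describes smoothing training labels with $k$-NN predictions, but gives no implementation details, so the following is only a minimal sketch of one plausible reading, not the paper's actual algorithm. The feature representation, the helper name `knn_smoothed_labels`, and the hyperparameters `k` and `alpha` are all assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_smoothed_labels(features, labels, num_classes, k=5, alpha=0.1):
    """Blend one-hot labels with k-NN class-probability estimates.

    Hypothetical sketch: `features` are assumed to be fixed representations
    of the training examples (e.g. embeddings from a trained network), and
    `k` / `alpha` are illustrative values, not ones taken from the paper.
    Assumes every class in 0..num_classes-1 appears in `labels`, so the
    columns of predict_proba align with the class indices.
    """
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(features, labels)
    # Per-example class distribution over the k nearest neighbors
    # (note: on the training set, each point counts itself as a neighbor).
    knn_probs = knn.predict_proba(features)      # shape: (n, num_classes)
    one_hot = np.eye(num_classes)[labels]        # shape: (n, num_classes)
    # Smoothed targets: mostly the original label, partly the k-NN estimate.
    return (1.0 - alpha) * one_hot + alpha * knn_probs
```

The smoothed targets would then replace the one-hot labels in the usual cross-entropy training loss; how the paper chooses the representation, $k$, and the mixing weight is described in the reviewed PDF linked below.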
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=9YwhK-x9hf
