Truth or backpropaganda? An empirical investigation of deep learning theory

Micah Goldblum; Jonas Geiping; Avi Schwarzschild; Michael Moeller; Tom Goldstein

Published: 20 Dec 2019; Last Modified: 22 Jun 2025; ICLR 2020 Conference Blind Submission; Readers: Everyone

Abstract: We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.

Keywords: Deep learning, generalization, loss landscape, robustness

TL;DR: We call into question commonly held beliefs regarding the loss landscape, optimization, network width, and rank.

Code: https://github.com/goldblum/TruthOrBackpropaganda

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/truth-or-backpropaganda-an-empirical/code)
