Visualizing the Loss Landscape of Neural Nets


Nov 07, 2017 · ICLR 2018 Conference Blind Submission
  • Abstract: As the effectiveness of deep neural networks continues to improve, there remain significant questions about how choices in network architecture, batch size, and parameter initialization impact the network’s trainability and effectiveness. Theoreticians have made notable progress, but it is often difficult to translate the assumptions required by theoretical results into meaningful statements about the differences between the neural nets used in practice. Another approach to understanding neural nets is to use visualizations to explore the empirical behavior of loss functions. However, without great care, these visualizations can produce distorted or misleading results. In this paper, we describe a simple approach to visualizing neural network loss functions that provides new insights into the trainability and generalization of neural nets. The technique is used to explore the effect of network architecture, choice of optimizer, and algorithm parameters on loss function minima.
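The kind of visualization the abstract describes can be sketched in miniature: plot the loss along a random direction in parameter space, rescaled so the perturbation is comparable to the scale of the weights (the full paper does this per filter; the whole-vector normalization, the toy linear-model loss, and all function names below are illustrative assumptions, not the paper's implementation).

```python
import numpy as np

def loss(theta, X, y):
    # Mean-squared error of a linear model, standing in for a network loss.
    return np.mean((X @ theta - y) ** 2)

def normalized_direction(theta, rng):
    # Random Gaussian direction rescaled to match the norm of theta.
    # (The paper applies this rescaling filter-by-filter in conv nets;
    # here we normalize the whole parameter vector for simplicity.)
    d = rng.standard_normal(theta.shape)
    return d * (np.linalg.norm(theta) / (np.linalg.norm(d) + 1e-10))

def loss_curve(theta, X, y, alphas, rng):
    # Evaluate L(theta + alpha * d) along one normalized random direction.
    d = normalized_direction(theta, rng)
    return np.array([loss(theta + a * d, X, y) for a in alphas])

# Toy data where theta_star is an exact minimizer of the loss.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
theta_star = rng.standard_normal(5)
y = X @ theta_star

alphas = np.linspace(-1.0, 1.0, 21)
curve = loss_curve(theta_star, X, y, alphas, rng)
```

Plotting `curve` against `alphas` gives a 1D slice of the loss surface around a minimizer; a 2D contour plot uses two such directions.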