A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima

28 Sept 2020, 15:52 (edited 20 Apr 2021)
ICLR 2021 Poster
Readers: Everyone
Keywords:
Abstract:
One-sentence Summary:
Code Of Ethics:
11 Replies
