How gradient estimator variance and bias impact learning in neural networks

Arna Ghosh; Yuhan Helena Liu; Guillaume Lajoie; Konrad Kording; Blake Aaron Richards

Published: 01 Feb 2023, Last Modified: 01 Mar 2023; Venue: ICLR 2023 poster; Readers: Everyone

Keywords: Computational Neuroscience, learning and plasticity, Credit assignment, Imperfect gradient descent, Gradient approximation, Biologically-plausible learning, Neuromorphic computing, Neural networks

TL;DR: We characterize the impact of variance and bias in gradient estimates on learning and generalization and study how network architecture properties modulate these effects.

Abstract: There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chips. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates necessarily rely on gradient estimators that have some variance and/or bias. There is therefore a need to better understand how variance and bias in gradient estimators affect learning, depending on network and task properties. Here, we show that variance and bias can impair learning on the training data, but that some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator depends on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task.
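To make the notion of a gradient estimator with variance and bias concrete, the following is a minimal sketch and not the authors' experimental setup. It assumes a toy quadratic loss whose per-parameter curvatures are given directly, and it corrupts the true gradient with a constant offset (bias) and Gaussian noise (variance). The function name `noisy_gradient_descent` and all parameter values are hypothetical choices made for illustration.

```python
import random


def noisy_gradient_descent(curvatures, bias, noise_std, lr=0.1, steps=500, seed=0):
    """Minimise L(w) = 0.5 * sum(h_i * w_i**2) using a corrupted gradient estimate.

    Assumption (illustrative only): the estimator equals the true gradient
    plus a constant bias plus zero-mean Gaussian noise, per parameter.
    """
    rng = random.Random(seed)          # seeded so runs are reproducible
    w = [1.0] * len(curvatures)        # start every parameter at 1.0
    for _ in range(steps):
        for i, h in enumerate(curvatures):
            true_grad = h * w[i]                                  # exact gradient of the quadratic
            estimate = true_grad + bias + rng.gauss(0.0, noise_std)  # biased, noisy estimate
            w[i] -= lr * estimate                                 # plain gradient step
    # Report the training loss reached at the end of optimisation.
    return 0.5 * sum(h * wi * wi for h, wi in zip(curvatures, w))


# Hypothetical curvatures: one sharp, one moderate, one flat direction.
curvatures = [1.0, 0.5, 0.1]

exact_loss = noisy_gradient_descent(curvatures, bias=0.0, noise_std=0.0)
biased_loss = noisy_gradient_descent(curvatures, bias=0.05, noise_std=0.0)
noisy_loss = noisy_gradient_descent(curvatures, bias=0.0, noise_std=0.05)

print("exact gradient :", exact_loss)
print("biased estimate:", biased_loss)
print("noisy estimate :", noisy_loss)
```

In this toy setting a constant bias shifts the fixed point of each parameter away from the minimum by an amount inversely proportional to the curvature, so flat directions are affected most. That is one simple way to see why curvature could matter for how much bias is tolerable. The sketch only measures training loss and says nothing about generalization.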

Please Choose The Closest Area That Your Submission Falls Into: Neuroscience and Cognitive Science (e.g., neural coding, brain-computer interfaces)
