An Isotropic Gaussian Perspective on Fully Connected Layers

11 Sept 2025 (modified: 19 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Isotropic Gaussian Distribution, Fully-connected layer, Weight Distribution Learning
TL;DR: We investigate the possibility of representing fully connected layers from an Isotropic Gaussian perspective, reducing parameters from O(n²) to O(n) while preserving performance via GDU and WDU training methods.
Abstract: We investigate the possibility of representing a fully connected layer in neural networks in a simple yet expressive statistical form, specifically modeling it with isotropic Gaussian distributions parameterized by a minimal set of means and variances. This formulation not only provides a new lens for understanding the statistical structure of fully connected layers but also reduces the memory requirements of their weight parameters from $\mathcal{O}(n^{2})$ to $\mathcal{O}(n)$ in practice. To learn the isotropic Gaussian distribution of fully connected layers, we propose two distribution learning methods, i.e., Gradient Distribution Update (GDU) and Weight Distribution Update (WDU), which integrate seamlessly with conventional back-propagation. These methods iteratively estimate the optimal isotropic mean and variance while ensuring compatibility with other layers and maintaining comparable model performance. Once learned, the fully connected layer is executed by sampling its weight parameters from the isotropic Gaussian distribution at run-time. Experiments show that fully connected layers can be effectively represented by isotropic Gaussians, suggesting the potential of statistical interpretations of widely used neural network operations.
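To make the O(n²) → O(n) reduction concrete, here is a minimal sketch (not the authors' GDU/WDU method, and all names and initializations are assumptions) of a fully connected layer that stores only per-output-row means and standard deviations and samples a transient weight matrix at run-time, as the abstract describes:

```python
import numpy as np

class IsotropicGaussianLinear:
    """Hypothetical sketch: an n_out x n_in weight matrix is never stored.

    Each output row i is modeled as an isotropic Gaussian N(mu_i, sigma_i^2 I),
    so only 2 * n_out scalars are kept (O(n) memory instead of O(n^2))."""

    def __init__(self, n_in, n_out, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_in = n_in
        # O(n) parameters: one mean and one std per output row (assumed init).
        self.mu = self.rng.normal(0.0, 0.1, size=n_out)
        self.sigma = np.full(n_out, 0.05)

    def __call__(self, x):
        # Sample the full weight matrix row-wise, W[i] ~ N(mu_i, sigma_i^2 I),
        # use it once for the matmul, then let it be garbage-collected.
        eps = self.rng.standard_normal((len(self.mu), self.n_in))
        W = self.mu[:, None] + self.sigma[:, None] * eps
        return x @ W.T

layer = IsotropicGaussianLinear(n_in=8, n_out=4)
y = layer(np.ones((2, 8)))
print(y.shape)  # (2, 4): batch of 2 mapped from 8 to 4 features
```

Here a dense layer of this shape would store 8 × 4 = 32 weights, while the sketch stores only 4 means and 4 standard deviations; how the means and variances are actually fitted (via GDU or WDU) is the subject of the paper.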
Supplementary Material: pdf
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 3851