Why CNN Features Are not Gaussian: A Statistical Anatomy of Deep Representations


CVPR 2026 Workshop HOW Proceedings Track · Submission29 Authors

Published: 21 Mar 2026, Last Modified: 23 May 2026 · HOW 2026 · CC BY 4.0

Include In Proceedings: Yes, include in CVPR proceedings

Public: Yes

Keywords: Feature Distribution, Copula Analysis, Long-Tail Distribution, Mechanistic Interpretability

TL;DR: The tails of CNN deep feature distributions grow longer with network depth because models attempt to represent natural image statistics.

Abstract: Deep convolutional neural networks (CNNs) are commonly analyzed through geometric and linear-algebraic perspectives, yet the statistical distribution of their internal feature activations remains poorly understood. In many applications, deep features are implicitly treated as Gaussian when modeling densities. In this work, we empirically examine this assumption and show that it does not accurately describe the distribution of CNN feature activations. Through a systematic study across multiple architectures and datasets, we find that feature activations deviate substantially from Gaussian and are better characterized by Weibull and related long-tailed distributions. We further introduce a novel Discretized Characteristic Function Copula (DCF-Copula) method to model multivariate feature dependencies. We find that tail length increases with network depth and that upper-tail dependence emerges between feature pairs. These findings are not consistent with the Gaussian behavior that a Central Limit Theorem argument would predict, and instead indicate a Matthew process that progressively concentrates semantic signal within the tails. They also suggest that CNNs are excellent at noise reduction yet poor at outlier removal. We recommend long-tailed, upper-tail-dependent priors rather than Gaussian priors for accurately modeling CNN deep feature density.
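
As a rough illustration of the Gaussian-versus-Weibull comparison described in the abstract, the sketch below fits both models by maximum likelihood and compares their log-likelihoods. It is not the authors' procedure and does not implement DCF-Copula. The "activations" are synthetic draws from a Weibull distribution with an assumed shape of 0.7 and scale of 1.0, standing in for non-negative feature values, and the helper names (`fit_weibull_shape`, `weibull_loglik`, `gaussian_loglik`) are hypothetical.

```python
import math
import random


def fit_weibull_shape(xs, lo=0.05, hi=20.0, iters=60):
    """Maximum-likelihood Weibull shape k, found by bisection on the score equation."""
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / len(xs)

    def score(k):
        # sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x); increasing in k, zero at the MLE.
        powers = [x ** k for x in xs]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def weibull_loglik(xs, k, lam):
    """Log-likelihood of xs under Weibull(shape=k, scale=lam)."""
    return sum(
        math.log(k / lam) + (k - 1.0) * math.log(x / lam) - (x / lam) ** k
        for x in xs
    )


def gaussian_loglik(xs):
    """Log-likelihood of xs under a Gaussian with MLE mean and variance."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


# Synthetic stand-in for non-negative activations (assumed shape 0.7, scale 1.0).
rng = random.Random(0)
activations = [x for x in (rng.weibullvariate(1.0, 0.7) for _ in range(5000)) if x > 0.0]

k_hat = fit_weibull_shape(activations)
# Given k, the MLE of the scale has a closed form.
lam_hat = (sum(x ** k_hat for x in activations) / len(activations)) ** (1.0 / k_hat)

ll_weibull = weibull_loglik(activations, k_hat, lam_hat)
ll_gaussian = gaussian_loglik(activations)

print(f"Weibull shape={k_hat:.3f}, scale={lam_hat:.3f}")
print(f"log-likelihood: Weibull={ll_weibull:.1f}, Gaussian={ll_gaussian:.1f}")
```

Both models have two free parameters, so their log-likelihoods can be compared directly. On real features, a held-out comparison or a goodness-of-fit test would be a more careful check than in-sample likelihood.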


Submission Number: 29
