Representation mitosis in wide neural networks

Published: 28 Jan 2022, Last Modified: 22 Oct 2023 · ICLR 2022 Submitted · Readers: Everyone
Abstract: Deep neural networks (DNNs) defy the classical bias-variance trade-off: adding parameters to a DNN that interpolates its training data will typically improve its generalization performance. Explaining the mechanism behind this "benign overfitting" in deep networks remains an outstanding challenge. Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find evidence for an underlying mechanism that we call "representation mitosis": if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise. As in a mitosis process, the number of such groups, or "clones", increases linearly with the width of the layer, but only if the width is above a critical value. We show that a key ingredient to activate mitosis is continuing the training process until the training error is zero.
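The clone structure described in the abstract can in principle be probed directly from activations. Below is a minimal, hypothetical sketch (not the paper's procedure): it clusters last-hidden-layer neurons by the correlation of their activations across inputs and reports multi-neuron groups as candidate clones. The function name, correlation threshold, and choice of complete-linkage clustering are all illustrative assumptions.

```python
# Illustrative sketch: find candidate "clone" groups among last-hidden-layer
# neurons by clustering neurons whose activations are strongly correlated
# across inputs. This is an assumed detection heuristic, not the authors' method.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def find_clone_groups(activations: np.ndarray, corr_threshold: float = 0.9):
    """activations: (n_samples, n_neurons) array of last-hidden-layer outputs,
    assumed to contain no constant (dead) neurons so correlations are defined.

    Returns a list of neuron-index arrays whose pairwise activation
    correlation exceeds `corr_threshold` (a hypothetical criterion).
    """
    # Pairwise Pearson correlation between neurons (rows of activations.T).
    corr = np.corrcoef(activations.T)
    # Turn similarity into a distance; zero the diagonal for squareform.
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    # Complete linkage keeps every within-cluster pair above the threshold.
    labels = fcluster(linkage(condensed, method="complete"),
                      t=1.0 - corr_threshold, criterion="distance")
    groups = [np.where(labels == k)[0] for k in np.unique(labels)]
    # Only multi-neuron groups qualify as candidate clones.
    return [g for g in groups if len(g) > 1]
```

If mitosis occurs as described, running such a probe on held-out activations of a sufficiently wide, fully trained network should return group counts that grow roughly linearly with layer width.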
One-sentence Summary: We describe a mechanism for benign overfitting of deep neural networks that we call "representation mitosis".
Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/arxiv:2106.03485/code)