How Deep Convolutional Neural Networks lose Spatial Information with training

Published: 03 Mar 2023, Last Modified: 29 Apr 2024
Physics4ML Poster
Readers: Everyone
Keywords: Deep Learning Theory, Convolutional Neural Networks, Curse of Dimensionality, Representation Learning, Feature Learning, Computer Vision, Pooling, Stability, Diffeomorphisms, Gaussian noise, Image Classification, Learning Invariants
TL;DR: Deep nets perform image classification by aggregating information over space. We investigate the mechanisms by which this is achieved and propose a theory for an artificial scale-detection task.
Abstract: A central question of machine learning is how deep nets learn tasks in high dimensions. An appealing hypothesis is that they build a representation of the data in which information irrelevant to the task is lost. For image datasets, this view is supported by the observation that after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the net. This loss of sensitivity correlates with performance and, surprisingly, with a gain of sensitivity to white noise acquired over training. These facts are unexplained and, as we demonstrate, still hold when white noise is added to the images of the training set. Here we (i) show empirically for various architectures that stability to diffeomorphisms is achieved by a combination of spatial and channel pooling; (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling; (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling. In particular, we find that both trends are caused by a diffusive spreading of the neurons' receptive fields through the layers.
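The quantities behind these observations are straightforward to probe numerically. Below is a minimal sketch, assuming PyTorch, of how one can estimate a network's sensitivity to smooth deformations relative to Gaussian noise of matched norm; the toy CNN, the low-frequency deformation generator, and the random input images are illustrative stand-ins, not the authors' trained networks or their max-entropy diffeomorphisms.

```python
# Minimal sketch (not the authors' code): estimate a network's sensitivity to
# smooth deformations vs. Gaussian noise of the same per-sample norm.
# The model, deformation strength, and image source are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# A small CNN with spatial pooling, standing in for the trained nets studied here.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
    nn.Flatten(), nn.Linear(32 * 8 * 8, 10),
).eval()

def smooth_deform(x, amplitude=0.1):
    """Warp images with a random low-frequency displacement field
    (a crude stand-in for the paper's smooth diffeomorphisms)."""
    n, _, h, w = x.shape
    # Low-resolution random displacements, upsampled to a smooth field.
    disp = amplitude * torch.randn(n, 2, 4, 4)
    disp = F.interpolate(disp, size=(h, w), mode="bicubic", align_corners=False)
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    base = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)
    grid = base + disp.permute(0, 2, 3, 1)
    return F.grid_sample(x, grid, align_corners=False)

x = torch.rand(64, 3, 32, 32)           # placeholder images
x_def = smooth_deform(x)

# Gaussian noise rescaled so its norm matches the deformation size per sample.
eta = torch.randn_like(x)
scale = (x_def - x).flatten(1).norm(dim=1) / eta.flatten(1).norm(dim=1)
x_noise = x + eta * scale.view(-1, 1, 1, 1)

with torch.no_grad():
    f0, f_def, f_noise = net(x), net(x_def), net(x_noise)

D = (f_def - f0).pow(2).sum(1).mean()    # sensitivity to diffeomorphisms
G = (f_noise - f0).pow(2).sum(1).mean()  # sensitivity to isotropic noise
print(f"relative sensitivity D/G = {(D / G).item():.3f}")
```

Comparing this ratio before and after training, and layer by layer through the net, is the kind of measurement underlying observations (i) and (iii): the abstract reports that training lowers sensitivity to diffeomorphisms while raising sensitivity to white noise.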
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2210.01506/code)