TL;DR: Neural networks that do a good job of classification project points into more spherical shapes before compressing them into fewer dimensions.
Abstract: In a typical deep learning approach to a computer vision task, Convolutional Neural Networks (CNNs) are used to extract features at varying levels of abstraction from an image and compress a high dimensional input into a lower dimensional decision space through a series of transformations. In this paper, we investigate how a class of input images is eventually compressed over the course of these transformations. In particular, we use singular value decomposition to analyze the relevant variations in feature space. These variations are formalized as the effective dimension of the embedding. We consider how the effective dimension varies across layers within class. We show that across datasets and architectures, the effective dimension of a class increases before decreasing further into the network, suggesting some sort of initial whitening transformation. Further, the decrease rate of the effective dimension deeper in the network corresponds with training performance of the model.