The Intrinsic Dimension of Images and Its Impact on Learning

Phil Pope; Chen Zhu; Ahmed Abdelkader; Micah Goldblum; Tom Goldstein

Published: 12 Jan 2021, Last Modified: 22 Jun 2025, ICLR 2021 Spotlight, Readers: Everyone

Keywords: generalization, dimension, manifold, ImageNet, CIFAR

Abstract: It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations. This idea underlies a common intuition for the remarkable success of deep learning in computer vision. In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in deep learning. We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images. Additionally, we find that low-dimensional datasets are easier for neural networks to learn, and models solving these tasks generalize better from training to test data. Along the way, we develop a technique for validating our dimension estimation tools on synthetic data generated by GANs, allowing us to actively manipulate the intrinsic dimension by controlling the image generation process. Code for our experiments may be found at https://github.com/ppope/dimensions.
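
Illustration: the abstract does not say which dimension estimation tools are used, so the following is only a minimal sketch of one common choice, a nearest-neighbor maximum-likelihood estimator in the style of Levina and Bickel, with inverse estimates averaged over points. The function names, the default `k=10`, and the synthetic linear-subspace data are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Nearest-neighbor MLE of intrinsic dimension (illustrative sketch).

    For each point, T_j is the distance to its j-th nearest neighbor.
    The global estimate is n * (k - 1) / sum_i sum_{j<k} log(T_k / T_j),
    i.e. the inverse of the averaged inverse local estimates.
    Assumes no duplicate points (a zero distance would break the ratio).
    """
    n = len(points)
    total_log_ratio = 0.0
    for i, x in enumerate(points):
        # Brute-force distances from x to every other point, ascending.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        t_k = dists[k - 1]
        total_log_ratio += sum(math.log(t_k / t_j) for t_j in dists[: k - 1])
    return n * (k - 1) / total_log_ratio


def sample_embedded_subspace(n, intrinsic_dim, ambient_dim, seed=0):
    """Hypothetical test data: Gaussian points on a random linear subspace."""
    rng = random.Random(seed)
    basis = [
        [rng.gauss(0, 1) for _ in range(ambient_dim)]
        for _ in range(intrinsic_dim)
    ]
    points = []
    for _ in range(n):
        coeffs = [rng.gauss(0, 1) for _ in range(intrinsic_dim)]
        points.append(
            [sum(c * b[a] for c, b in zip(coeffs, basis)) for a in range(ambient_dim)]
        )
    return points


# 400 points with 3 degrees of freedom, stored as 20-dimensional vectors.
points = sample_embedded_subspace(n=400, intrinsic_dim=3, ambient_dim=20)
estimate = mle_intrinsic_dimension(points, k=10)
print(f"ambient dimension: 20, estimated intrinsic dimension: {estimate:.2f}")
```

On data like this the estimate should land near the number of degrees of freedom rather than near the ambient dimension, which is the same contrast the abstract draws between intrinsic dimension and pixel count. The estimate depends on `k` and on sample size, so it is best read as approximate.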

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We measure the dimensionality of commonly used datasets and experimentally investigate whether the links between dimensionality and learning that have been identified in the manifold learning literature describe the behavior of deep neural networks.

Code: [ppope/dimensions](https://github.com/ppope/dimensions)

Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [CIFAR-100](https://paperswithcode.com/dataset/cifar-100), [CelebA](https://paperswithcode.com/dataset/celeba), [ImageNet](https://paperswithcode.com/dataset/imagenet), [MS COCO](https://paperswithcode.com/dataset/coco), [SVHN](https://paperswithcode.com/dataset/svhn)

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/the-intrinsic-dimension-of-images-and-its/code)

9 Replies
