The Intrinsic Dimension of Images and Its Impact on Learning

28 Sep 2020 (modified: 25 Jan 2021) | ICLR 2021 Spotlight | Readers: Everyone
  • Keywords: generalization, dimension, manifold, ImageNet, CIFAR
  • Abstract: It is widely believed that natural image data exhibits low-dimensional structure despite being embedded in a high-dimensional pixel space. This idea underlies a common intuition for the success of deep learning and has been exploited for enhanced regularization and adversarial robustness. In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in neural network learning. We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images. Additionally, we find that low-dimensional datasets are easier for neural networks to learn. We validate our dimension estimation tools on synthetic data generated by GANs in which we can manipulate intrinsic dimension.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
  • One-sentence Summary: We measure the dimensionality of commonly used datasets, and experimentally investigate whether the links between dimensionality and learning that have been identified in the manifold learning literature describe the behaviors of deep neural networks.
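
The abstract refers to "dimension estimation tools" without naming one here. As a rough illustration only, the sketch below shows one well-known nearest-neighbor approach, a maximum-likelihood estimator in the style of Levina and Bickel, using the k-2 normalization (a common bias-corrected variant). The choice of estimator, the function name `mle_intrinsic_dimension`, the parameter `k`, and the synthetic data are all assumptions made for this example and are not taken from the page above.

```python
import numpy as np


def mle_intrinsic_dimension(points, k=20):
    """Nearest-neighbor MLE of intrinsic dimension, averaged over all points.

    Hypothetical helper for illustration. Assumes no duplicate points
    (a zero neighbor distance would make the log-ratio undefined).
    """
    points = np.asarray(points, dtype=np.float64)
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq_norms = np.sum(points ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (points @ points.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    np.fill_diagonal(sq_dists, np.inf)       # a point is not its own neighbor
    # Distances to the k nearest neighbors of each point, ascending.
    knn = np.sqrt(np.sort(sq_dists, axis=1)[:, :k])
    # Log-ratios of the k-th neighbor distance to each closer neighbor distance.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 2) / np.sum(log_ratios, axis=1)
    return float(np.mean(per_point))


if __name__ == "__main__":
    # Synthetic check: 3-dimensional Gaussian data linearly embedded in 40 dimensions.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(1500, 3))
    embedding = rng.normal(size=(3, 40))
    print(mle_intrinsic_dimension(latent @ embedding, k=20))
```

On data like this, the estimate tracks the latent dimension (3) and not the ambient dimension (40), which is the kind of gap between pixel count and intrinsic dimension that the abstract describes for image datasets.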