Dataset Condensation with Gradient Matching

28 Sep 2020 (modified: 19 Feb 2021) · ICLR 2021 Oral · Readers: Everyone
  • Keywords: dataset condensation, data-efficient learning, image generation
  • Abstract: As the state-of-the-art machine learning methods in many fields rely on increasingly large datasets, storing these datasets and training models on them becomes significantly more expensive. This paper proposes a training set synthesis technique for \emph{data-efficient} learning, called \emph{Dataset Condensation}, that learns to condense a large dataset into a small set of informative samples for training deep neural networks from scratch. We formulate this goal as a gradient matching problem between the gradients of a deep neural network trained on the original data and on our synthetic data (a minimal sketch of this matching step is given after this list). We rigorously evaluate its performance on several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods. Finally, we explore the use of our method in continual learning and neural architecture search and show that it achieves promising gains on a tight budget of memory and computation.
  • One-sentence Summary: This paper proposes a training set synthesis technique that learns to produce a small set of informative samples for training deep neural networks from scratch at a fraction of the computational cost, while closely approaching the accuracy obtained with the full dataset.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
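The gradient matching objective mentioned in the abstract can be illustrated with a short sketch. The snippet below is not the authors' released implementation; the network `net`, the batch tensors, and the layer-wise cosine-style distance are placeholders assumed for illustration of the general idea (match gradients computed on a real batch with gradients computed on the learnable synthetic batch).

```python
# Minimal sketch of a gradient-matching step, assuming PyTorch.
# All names (net, real_images, real_labels, syn_images, syn_labels) are
# hypothetical placeholders, not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_match_loss(net, real_images, real_labels, syn_images, syn_labels):
    """Distance between gradients of the training loss on a real batch
    and on the synthetic batch, w.r.t. the same network parameters."""
    criterion = nn.CrossEntropyLoss()
    params = [p for p in net.parameters() if p.requires_grad]

    # Gradients on the real batch, treated as fixed targets.
    loss_real = criterion(net(real_images), real_labels)
    grads_real = [g.detach() for g in torch.autograd.grad(loss_real, params)]

    # Gradients on the synthetic batch; keep the graph so the matching loss
    # backpropagates into the synthetic images themselves.
    loss_syn = criterion(net(syn_images), syn_labels)
    grads_syn = torch.autograd.grad(loss_syn, params, create_graph=True)

    # Layer-wise cosine-style distance between the two gradient sets.
    dist = 0.0
    for gr, gs in zip(grads_real, grads_syn):
        dist = dist + (1 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0))
    return dist
```

In use, `syn_images` would be a learnable tensor (`requires_grad=True`) updated by an outer optimizer, e.g. `torch.optim.SGD([syn_images], lr=0.1)`: compute `gradient_match_loss(...)`, call `.backward()`, and step that optimizer, while the network itself is periodically trained on the synthetic data.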