- Keywords: dataset condensation, data-efficient learning, image generation
- Abstract: As state-of-the-art machine learning methods in many fields rely on ever larger datasets, storing these datasets and training models on them becomes increasingly expensive. This paper proposes a training set synthesis technique for \emph{data-efficient} learning, called \emph{Dataset Condensation}, that learns to condense a large dataset into a small set of informative samples for training deep neural networks from scratch. We formulate this goal as a gradient matching problem between the gradients of a deep neural network trained on the original data and on our synthetic data. We rigorously evaluate its performance on several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods. Finally, we explore the use of our method in continual learning and neural architecture search and show that it achieves promising gains on a tight memory and computation budget. (A minimal gradient-matching sketch is included below for illustration.)
- One-sentence Summary: This paper proposes a training set synthesis technique that learns to produce a small set of informative samples for training deep neural networks from scratch at a small fraction of the computational cost, while achieving results as close as possible to training on the full dataset.
- Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
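The sketch below is a minimal, illustrative PyTorch rendering of the gradient-matching idea described in the abstract: synthetic images are optimized so that the network gradients they induce match those induced by a batch of real data. The names (`condensation_step`, `match_loss`, etc.) and the simplified layer-wise cosine distance are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def match_loss(grads_real, grads_syn):
    # Simplified layer-wise cosine distance between the two gradient sets
    # (an illustrative stand-in for the paper's matching distance).
    loss = 0.0
    for g_r, g_s in zip(grads_real, grads_syn):
        g_r, g_s = g_r.flatten(), g_s.flatten()
        loss = loss + 1 - torch.dot(g_r, g_s) / (g_r.norm() * g_s.norm() + 1e-8)
    return loss

def condensation_step(model, real_x, real_y, syn_x, syn_y, syn_opt):
    # Gradients of the training loss w.r.t. the network weights on real data ...
    grads_real = torch.autograd.grad(
        F.cross_entropy(model(real_x), real_y), model.parameters())
    grads_real = [g.detach() for g in grads_real]
    # ... and on the learnable synthetic images (graph kept for backprop).
    grads_syn = torch.autograd.grad(
        F.cross_entropy(model(syn_x), syn_y), model.parameters(),
        create_graph=True)
    # Update the synthetic images so the two gradient sets match.
    loss = match_loss(grads_real, grads_syn)
    syn_opt.zero_grad()
    loss.backward()
    syn_opt.step()
    return loss.item()
```

Here `syn_x` would be a tensor of synthetic images with `requires_grad=True` and `syn_opt` an optimizer over `[syn_x]`; in practice this step alternates with updates to the network weights.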